DRUG TARGETS IN CANDIDA ALBICANS



The present invention is concerned with the
identification of genes or functional fragments
thereof from Candida albicans which are critical for
growth and cell division and which genes may be used
as selective drug targets to treat Candida albicans
associated infections. Novel nucleic acid sequences
from Candida albicans are also provided and which
encode the polypeptides which are critical for growth
of Candida albicans.

Opportunistic infections in immunocompromised
hosts represent an increasingly common cause of
mortality and morbidity. Candida species are among
the most commonly identified fungal pathogens
associated with such opportunistic infections, with
Candida albicans being the most common species. Such
fungal infections are thus problematical in, for
example, AIDS populations in addition to normal
healthy women where Candida albicans yeasts represent
the most common cause of vulvovaginitis.

Although compounds do exist for treating such
disorders, such as, for example, amphotericin, these
drugs are generally limited in their use because
of their toxicity and side effects. Therefore, there
exists a need for new compounds which may be used to
treat Candida associated infections in addition to
compounds which are selective in their action against
Candida albicans.

Classical approaches for identifying anti-fungal
compounds have relied almost exclusively on inhibition
of fungal or yeast growth as an endpoint. Libraries
of natural products, semi-synthetic, or synthetic
chemicals are screened for their ability to kill or
arrest growth of the target pathogen or a related
nonpathogenic model organism. These tests are
cumbersome and provide no information about a
compound's mechanism of action. The promising lead
compounds that emerge from such screens must then be
tested for possible host-toxicity and detailed
mechanism of action studies must subsequently be
conducted to identify the affected molecular target.

The present inventors have now identified a range
of nucleic acid sequences from Candida albicans which
encode polypeptides which are critical for its
survival and growth. These sequences represent novel
[text illegible in source]
selectively identify compounds capable of inhibiting
expression of such polypeptides and their potential
use in alleviating diseases or conditions associated
with Candida albicans infection.

Therefore, according to a first aspect of the
invention there is provided a nucleic acid molecule
encoding a polypeptide which is critical for survival
and growth of the yeast Candida albicans and which
nucleic acid molecule comprises any of the sequences
of nucleotides in Sequence ID Numbers 1 to 3, 5, 6, 8
to 11, 13, 15, 16, 18, 20, 21, 23, 25 to 29, 31, 35,
37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61,
63, 65, 67, 69 and 71.

A further aspect of the invention comprises a
nucleic acid molecule encoding a polypeptide which is
critical for survival and growth of the yeast Candida
albicans and which nucleic acid molecule comprises any
of the sequences of Sequence ID Numbers 1, 28, 35, 37
and 39 and fragments or derivatives of said nucleic
acid molecules.

Also provided by the present invention is a
nucleic acid molecule encoding a polypeptide which is
critical for survival and growth of the yeast Candida
albicans and which polypeptide has an amino acid
sequence according to the sequence of any of Sequence
ID Numbers 4, 7, 12, 14, 17, 19, 22, 24, 30, 32 to 34,
36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60,
62, 64, 66, 68, 70 and 72.

Letters utilised in the nucleic acid sequences
according to the invention which are not recognisable
as letters of the genetic code signify a position in
the nucleic acid sequence where one or more of bases
A, G, C or T can occupy the nucleotide position.
Representative letters used to identify the range of
bases which can be used are as follows:



M: A or C
R: A or G
W: A or T
S: C or G
Y: C or T
K: G or T
V: A or C or G
H: A or C or T
D: A or G or T
B: C or G or T
N: G or A or T or C
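The ambiguity letters listed above can be expanded mechanically into the set of concrete sequences they stand for. The following is a minimal Python sketch, not part of the original disclosure; the function name `expand` is illustrative, and N is assumed to stand for any of the four bases, following the usual IUPAC convention.

```python
from itertools import product

# Ambiguity codes as listed in the text; N is assumed to mean any base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def expand(seq):
    """Return every unambiguous sequence matching an ambiguity-coded sequence."""
    choices = (IUPAC[base] for base in seq.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # A two-letter example: A followed by Y (C or T).
    print(expand("AY"))
```

The number of expanded sequences grows multiplicatively with each ambiguous position, so this kind of enumeration is only practical for short probes or primers.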

In one embodiment of the above identified aspects
of the invention the nucleic acid may comprise an mRNA
molecule or alternatively a DNA and preferably a cDNA
molecule.

Also provided by the present invention is a
nucleic acid molecule capable of hybridising to the
nucleic acid molecules according to the invention
under high stringency conditions.

Stringency of hybridisation as used herein refers
to conditions under which polynucleic acids are
stable. The stability of hybrids is reflected in the
melting temperature (Tm) of the hybrids. Tm can be
approximated by the formula:

$$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6\,\log_{10}[\mathrm{Na}^{+}] + 0.41\,(\%\mathrm{G\&C}) - 600/l$$

wherein l is the length of the hybrids in nucleotides.
Tm decreases approximately by 1-1.5°C with every 1%
decrease in sequence homology.
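As a worked illustration of the approximation above, the following minimal Python sketch evaluates the formula and applies the stated 1-1.5°C reduction per 1% decrease in homology. The function and parameter names are illustrative assumptions, and the sodium concentration is taken to be in mol/l.

```python
import math


def melting_temperature(na_molar, gc_percent, length_nt):
    """Approximate hybrid Tm (deg C) from the formula quoted in the text.

    na_molar: sodium ion concentration, assumed to be in mol/l
    gc_percent: G+C content in percent (0-100)
    length_nt: hybrid length in nucleotides
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 600.0 / length_nt)


def tm_after_mismatch(tm_perfect, percent_mismatch, drop_per_percent=1.0):
    """Lower Tm by 1-1.5 deg C per 1% decrease in homology (default 1.0)."""
    return tm_perfect - drop_per_percent * percent_mismatch


if __name__ == "__main__":
    tm = melting_temperature(na_molar=1.0, gc_percent=50, length_nt=600)
    print(round(tm, 1), round(tm_after_mismatch(tm, 10, 1.5), 1))
```

The two-step form makes it easy to bracket the expected Tm of a partially homologous hybrid by evaluating both ends of the 1-1.5°C range.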

The nucleic acid capable of hybridising to 
nucleic acid molecules according to the invention will 
generally be at least 70%, preferably at least 80 or 
90% and more preferably at least 95% homologous to the 
nucleotide sequences according to the invention. 

The DNA molecules according to the invention may, 
advantageously, be included in a suitable expression 
vector to express polypeptides encoded therefrom in a 
suitable host. 

The present invention also comprises within its 
scope proteins or polypeptides encoded by the nucleic 
acid molecules according to the invention or a 
functional equivalent, derivative or bioprecursor 
thereof.

Therefore, according to a further aspect of the 
invention there is provided a polypeptide having an 
amino acid sequence of any of Sequence ID Numbers 4, 
7, 12, 14, 17, 19, 22, 24, 30, 32 to 34, 36, 38, 40, 
42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 
68, 70 and 72. A polypeptide encoded by the nucleic 
acid molecule according to the invention is also 
provided, which polypeptide preferably comprises an 
amino acid sequence having the sequence of any of
Sequence ID Numbers 4, 7, 12, 14, 17, 19, 22, 24, 30, 
32 to 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 
58, 60, 62, 64, 66, 68, 70 and 72. 






An expression vector according to the invention
includes a vector having a nucleic acid according to
the invention operably linked to regulatory sequences,
such as promoter regions, that are capable of
effecting expression of said DNA fragments. The term
"operably linked" refers to a juxtaposition wherein
the components described are in a relationship
permitting them to function in their intended manner.
Such vectors may be transformed into a suitable host
cell to provide for expression of a polypeptide
according to the invention. Thus, in a further
aspect, the invention provides a process for preparing
polypeptides according to the invention which
comprises cultivating a host cell, transformed or
transfected with an expression vector as described
above, under conditions to provide for expression by
the vector of a coding sequence encoding the
polypeptides, and recovering the expressed
polypeptides.

The vectors may be, for example, plasmid, virus
or phage vectors provided with an origin of
replication, optionally a promoter for the expression
of said nucleotide and optionally a regulator of the
promoter. The vectors may contain one or more
selectable markers, such as, for example, ampicillin
resistance.

Polynucleotides according to the invention may be
inserted into the vectors described in an antisense
orientation in order to provide for the production of
antisense RNA. Antisense RNA or other antisense
nucleic acids may be produced by synthetic means.

In accordance with the present invention, a
defined nucleic acid includes not only the identical
nucleic acid but also any minor base variations
including in particular, substitutions in bases which
result in a synonymous codon (a different codon
specifying the same amino acid residue) due to the
degenerate code in conservative amino acid
substitutions. The term "nucleic acid sequence" also
includes the complementary sequence to any single
stranded sequence given regarding base variations.

The present invention also advantageously
provides nucleic acid sequences of at least
approximately 10 contiguous nucleotides of a nucleic
acid according to the invention and preferably from 10
to 50 nucleotides. These sequences may,
advantageously, be used as probes or primers to
initiate replication, or the like. Such nucleic acid
sequences may be produced according to techniques well
known in the art, such as by recombinant or synthetic
means. They may also be used in diagnostic kits or
the like for detecting the presence of a nucleic acid
according to the invention. These tests generally
comprise contacting the probe with the sample under
hybridising conditions and detecting the presence
of any duplex or triplex formation between the probe
and any nucleic acid in the sample.

According to the present invention these probes
may be anchored to a solid support. Preferably, they
are present on an array so that multiple probes can
simultaneously hybridize to a single biological
sample. The probes can be spotted onto the array or
synthesised in situ on the array. (See Lockhart et
al., Nature Biotechnology, vol. 14, December 1996,
"Expression monitoring by hybridisation to high
density oligonucleotide arrays".) A single array can
contain more than 100, 500 or even 1,000 different
probes in discrete locations.

Advantageously, the nucleic acid sequences
according to the invention may be produced using such
recombinant or synthetic means, such as, for example,
using PCR cloning mechanisms which generally involve
making a pair of primers, which may be from
approximately 10 to 50 nucleotides, to a region of the
gene which is desired to be cloned, bringing the
primers into contact with mRNA, cDNA, or genomic DNA
from a human cell, performing a polymerase chain
reaction under conditions which bring about
amplification of the desired region, isolating the
amplified region or fragment and recovering the
amplified DNA. Generally, such techniques as defined
herein are well known in the art, such as described in
Sambrook et al. (Molecular Cloning: a Laboratory
Manual, 1989).

The nucleic acids or oligonucleotides according
to the invention may carry a revealing label.
Suitable labels include radioisotopes such as 32P or
35S, enzyme labels or other protein labels such as
biotin or fluorescent markers. Such labels may be
added to the nucleic acids or oligonucleotides of the
invention and may be detected using techniques known
per se.

The polypeptide or protein according to the
invention includes all possible amino acid variants
encoded by the nucleic acid molecule according to the
invention including a polypeptide encoded by said
molecule and having conservative amino acid changes.
Polypeptides according to the invention further
include variants of such sequences, including
naturally occurring allelic variants which are
substantially homologous to said polypeptides. In
this context, substantial homology is regarded as a
sequence which has at least 70%, preferably 80 or 90%
amino acid homology with the polypeptides encoded by
the nucleic acid molecules according to the invention.

A nucleic acid which is particularly advantageous
is one comprising the sequences of nucleotides
illustrated in Figures 1 which is specific to Candida
albicans with no functionally related sequences in
other prokaryotic or eukaryotic organisms as yet
identified from the respective genomic databases.

Nucleotide sequences according to the invention
are particularly advantageous as selective
therapeutic targets for treating Candida albicans
associated infections. For example, an antisense
nucleic acid capable of binding to the nucleic acid
[text illegible in source]
selectively inhibit expression of the corresponding
polypeptides, leading to impaired growth of the
Candida albicans with reductions of associated
illnesses or diseases.

The nucleic acid molecule or the polypeptide 
according to the invention may be used as a 
medicament, or in the preparation of a medicament, for 
treating diseases or conditions associated with 
Candida albicans infection. 

Advantageously, the nucleic acid molecule or the 
polypeptide according to the invention may be provided 
in a pharmaceutical composition together with a 
pharmaceutically acceptable carrier, diluent or
excipient therefor. 

Antibodies to the protein or polypeptide of the
present invention may, advantageously, be prepared by
techniques which are known in the art. For example,
polyclonal antibodies may be prepared by inoculating a
host animal, such as a mouse, with the polypeptide
according to the invention or an epitope thereof and
recovering immune serum. Monoclonal antibodies may be
prepared according to known techniques such as
described by Kohler G. and Milstein C., Nature (1975)
256, 495-497.

Antibodies according to the invention may also be
used in a method of detecting the presence of a
polypeptide according to the invention, which method
comprises reacting the antibody with a sample and
identifying any protein bound to said antibody. A kit
may also be provided for performing said method which
comprises an antibody according to the invention and
means for reacting the antibody with said sample.

Proteins which interact with the polypeptide of
the invention may be identified by investigating
protein-protein interactions using the two-hybrid
vector system first proposed by Chien et al. (1991).
This technique is based on functional
reconstitution in vivo of a transcription factor which
activates a reporter gene. More particularly the
technique comprises providing an appropriate host cell
with a DNA construct comprising a reporter gene under
the control of a promoter regulated by a transcription
factor having a DNA binding domain and an activating
domain, expressing in the host cell a first hybrid DNA
sequence encoding a first fusion of a fragment or all
of a nucleic acid sequence according to the invention
and either said DNA binding domain or said activating
domain of the transcription factor, expressing in the
host at least one second hybrid DNA sequence, such as
a library or the like, encoding putative binding
proteins to be investigated together with the DNA
binding or activating domain of the transcription
factor which is not incorporated in the first fusion;
detecting any binding of the proteins to be
investigated with a protein according to the invention
by detecting the presence of any reporter gene
product in the host cell; optionally isolating second
hybrid DNA sequences encoding the binding protein.

An example of such a technique utilises the GAL4
protein in yeast. GAL4 is a transcriptional activator
of galactose metabolism in yeast and has a separate
domain for binding to activators upstream of the
galactose metabolising genes as well as a protein
binding domain. Nucleotide vectors may be
constructed, one of which comprises the nucleotide
residues encoding the DNA binding domain of GAL4.
These binding domain residues may be fused to a known
protein encoding sequence, such as, for example, the
nucleic acids according to the invention. The other
vector comprises the residues encoding the protein
binding domain of GAL4. These residues are fused to
residues encoding a test protein. Any interaction
between polypeptides encoded by the nucleic acid
according to the invention and the protein to be
tested leads to transcriptional activation of a
reporter molecule in a GAL-4 transcription deficient
yeast cell into which the vectors have been
transformed. Preferably, a reporter molecule such as
β-galactosidase is activated upon restoration of
transcription of the yeast galactose metabolism genes.

Further provided by the present invention is one 
or more Candida albicans cells comprising an induced 
mutation in the DNA sequence encoding the polypeptide 
according to the invention. 

A further aspect of the invention provides a
method of identifying compounds which selectively
inhibit or interfere with the expression, or the
functionality of polypeptides expressed from the
nucleotide sequences according to the invention or
the metabolic pathways in which these polypeptides are
involved and which are critical for growth and
survival of Candida albicans, which method comprises
(a) contacting a compound to be tested with one or
more Candida albicans cells having a mutation in a
nucleic acid molecule according to the invention which
mutation results in overexpression or underexpression
of said polypeptides in addition to one or more wild
type Candida cells, (b) monitoring the growth and/or
activity of said mutated cell compared to said wild
type wherein differential growth or activity of said
one or more mutated Candida cells provides an
indication of selective action of said compound on
said polypeptide or another polypeptide in the same or
a parallel pathway.

Compounds identifiable or identified using the
method according to the invention may advantageously
be used as a medicament, or in the preparation of a
medicament to treat diseases or conditions associated
with Candida albicans infection. These compounds may
also advantageously be included in a pharmaceutical
composition together with a pharmaceutically
acceptable carrier, diluent or excipient therefor.

A further aspect of the invention provides a
method of identifying DNA sequences from a cell or
organism which DNA encodes polypeptides which are
critical for growth or survival, which method
comprises (a) preparing a cDNA or genomic library from
said cell or organism in a suitable expression vector
which vector is such that it can either integrate into
the genome in said cell or that it permits
transcription of antisense RNA from the nucleotide
sequences in said cDNA or genomic library, (b)
selecting transformants exhibiting impaired growth and
determining the nucleotide sequence of the cDNA or
genomic sequence from the library included in the
vector from said transformant. Preferably, the cell
or organism may be any yeast or filamentous fungi,
such as, for example, Saccharomyces cerevisiae,
Schizosaccharomyces pombe or Candida albicans.

A further aspect of the invention provides a 
pharmaceutical composition comprising a compound 
according to the invention together with a 
pharmaceutically acceptable carrier, diluent or 
excipient therefor. 

A further aspect of the invention comprises 
nucleic acid molecules encoding proteins which are 
critical for survival and growth of Candida albicans, 
which nucleic acid molecules comprise any of the 
sequences illustrated in Figures 5 to 29. 
Polypeptides which are critical for survival and 
growth of Candida albicans are also encompassed within 
the present invention, and which polypeptides comprise 
any of the amino acid sequences illustrated in Figures 
29 to 39. 

The present invention may be more clearly
understood with reference to the accompanying example,
which is purely exemplary, with reference to the
accompanying drawings wherein:

Figure 1: is a diagrammatic representation
of plasmid pGAL1PNiST-1.

Figure 2: is a nucleotide sequence of
plasmid pGAL1PNiST-1 of Figure 1.

Figure 3: is a diagrammatic representation
of plasmid pGAL1PSiST-1.

Figure 4: is a nucleotide sequence of
plasmid pGAL1PSiST-1 of Figure 3.

Figures 5 to 28: illustrate the nucleotide
sequences of oligonucleotides
encoding polypeptides of
previously unknown function
isolated from Candida albicans
which are critical for its
survival and growth, according to
the invention.

Figures 29 to 39: illustrate the amino acid
sequences of polypeptides from
Candida albicans which are
critical for its survival and
growth, according to the
invention.



Example 1

Identification of novel drug targets in C.
albicans by anti-sense and disruptive integration

The principle of the approach is based on the
fact that when a particular C. albicans mRNA is
inhibited by producing the complementary anti-sense
RNA, the level of the corresponding protein will
decrease. If this protein is critical for growth or
survival, the cell producing the anti-sense RNA will
grow more slowly or will die.

Since anti-sense inhibition occurs at the mRNA level,
the gene copy number is irrelevant, thus allowing
application of the strategy even in diploid
organisms.

Anti-sense RNA is endogenously produced from an
integrative or episomal plasmid with an inducible
promoter; induction of the promoter leads to the
production of an RNA encoded by the insert of the
plasmid. This insert will differ from one plasmid to
another in the library. The inserts will be derived
from genomic DNA fragments or from cDNA to cover, to
the extent possible, the entire genome.

The vector is a proprietary vector allowing
integration by homologous recombination at either the
homologous insert or promoter sequence in the Candida
genome. After introducing plasmids from cDNA or
genomic libraries into C. albicans, transformants are
screened for impaired growth after promoter (and thus
anti-sense) induction in the presence of lithium
acetate. Lithium acetate prolongs the G1 phase and
thus allows anti-sense to act during a prolonged
period of time during the cell cycle. Transformants
which show impaired growth in both induced and
non-induced media, thus showing a growth defect due to
integrative disruption, are selected as well.

Transformants showing impaired growth are
supposed to contain plasmids which produce anti-sense
RNA to mRNAs critical for growth or survival. Growth
is monitored by measuring growth-curves over a period
of time in a device (Bioscreen Analyzer, Labsystems)
which allows simultaneous measurement of growth-curves
of 200 transformants.

Subsequently plasmids can be recovered from the
transformants and the sequence of their inserts
determined, thus revealing which mRNA they inhibit. In
order to be able to recover the genomic or cDNA insert
which has integrated into the Candida genome, genomic
DNA is isolated, cut with an enzyme which cuts only
once in the library vector (and is estimated to cut
approx. every 4096 bp in the genome) and religated. PCR
with primers flanking the insert will yield (partial)
genomic or cDNA inserts as PCR fragments which can
directly be sequenced. This PCR analysis (on the
ligation reaction) will also show how many integrations
occurred. Alternatively the ligation reaction is
transformed into E. coli and PCR analysis is performed
on colonies or on plasmid DNA derived therefrom.
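
The estimate of one cut roughly every 4096 bp is what a six-base recognition sequence gives under the simplifying assumption that the four bases occur independently and with equal frequency (4 to the power 6 = 4096). The following minimal Python sketch, with illustrative function names, makes that arithmetic explicit; real genomes deviate from this because of base composition bias.

```python
def expected_cut_spacing(site_length, alphabet_size=4):
    """Average spacing in bp between occurrences of one specific site,
    assuming independent and equally frequent bases."""
    return alphabet_size ** site_length


def expected_cut_count(genome_bp, site_length):
    """Expected number of sites in a genome of the given size (same assumption)."""
    return genome_bp / expected_cut_spacing(site_length)


if __name__ == "__main__":
    print(expected_cut_spacing(6))          # six-base recognition sequence
    print(expected_cut_spacing(4))          # four-base recognition sequence
    print(expected_cut_count(4096 * 1000, 6))
```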

This method is employed for a genome wide search
for novel C. albicans genes which are important for
growth or survival.

Materials & Methods
Construction of pGAL1PNiST-1

The backbone of the pGAL1PNiST-1 vector
(integrative anti-sense SfiI-NotI vector) is
pGEM11Zf(+) (Promega Inc.). First, the CaMAL2
EcoRI/SalI promoter fragment from pDBVSO (D.H. Brown
et al.) was ligated into EcoRI/SalI-opened
pGEM11Zf(+) resulting in the intermediate construct
pGEMMAL2P-1. Into the latter (MscI/CIP) the CaURA3
selection marker was cloned as an Eco47III/XmnI
fragment derived from pRM2. The resulting pGEMMAL2P-2
vector was NotI/HindIII opened in order to accept the
NotI-stuffer-SfiI cassette from pPCK1NiSCYCT-1
(EagI/HindIII fragment): pMAL2PNiST-1. Finally, the
plasmid pGAL1PNiST-1 was constructed by exchanging the
SalI/Ecl136II MAL2 promoter in pMAL2PNiST-1 by the
XhoI/SmaI GAL1 promoter fragment derived from
pRM2GAL1P.

Construction of pGAL1PSiST-1

The vector pGAL1PSiST-1 was created for cloning
the small genomic DNA fragments (flanked by SfiI
sites) behind the GAL1 promoter. The only difference
with pGAL1PNiST-1 is that the hIFNβ (stuffer fragment)
insert fragment in pGAL1PSiST-1 is flanked by two SfiI
sites instead of a SfiI and a NotI site as in
pGAL1PNiST-1. To construct pGAL1PSiST-1 the
EcoRI-HindIII fragment, containing hIFNβ flanked by a SfiI
and a NotI site, of pMAL2pHiET-3 (unpublished) was
exchanged by the EcoRI-HindIII fragment, containing
hIFNβ flanked by two SfiI sites, from YCp50S-S (an E.
coli / S. cerevisiae shuttle vector derived from the
plasmid YCp50, which is deposited in the ATCC
collection (number 37419; Thrash et al., 1985); an
EcoRI-HindIII fragment, containing the gene hIFNβ,
which is flanked by two SfiI sites, was inserted in
YCp50, creating YCp50S-S), resulting in plasmid
pMAL2PSiST-1. The MAL2 promoter from pMAL2PSiST-1 (by
a NaeI-FspI digest) was further replaced by the GAL1
promoter from pGAL1PNiST-1 (via a XhoI-SalI digest),
creating the vector pGAL1PSiST-1.

Candida albicans genomic library
* Preparation of the genomic DNA fragments
A Candida albicans genomic DNA library with small DNA
fragments (400 to 1,000 bp) was prepared. Genomic DNA
of Candida albicans B2630 was isolated following a
modified protocol of Blin and Stafford (1976). The
quality of the isolated genomic DNA was checked by gel
electrophoresis. Undigested DNA was located on the gel
above the marker band of 26,282 bp. A little smear,
caused by fragmentation of the DNA, was present.
To obtain enrichment for genomic DNA fragments of the
desired size, the genomic DNA was partially digested.
Several restriction enzymes (AluI, HaeIII and RsaI,
all creating blunt ends) were tried out. The
appropriate digest conditions were determined by
titration of the enzyme. Enrichment of small DNA
fragments was obtained with 70 units of AluI on 10 µg
of genomic DNA for 20 min. T4 DNA polymerase
(Boehringer) and dNTPs (Boehringer) were added to
polish the DNA ends. After extraction with
phenol-chloroform the digest was size-fractionated on an
agarose gel. The genomic DNA fragments with a length
of 500 to 1,250 bp were eluted from the gel by
centrifugal filtration (Zhu et al., 1985). SfiI
adaptors (5' GTTGGCCTTTT) or (5' AGGCCAAC) were
attached to the DNA ends (blunt) to facilitate cloning
of the fragments into the vector. To this end, an 8-mer
and an 11-mer oligonucleotide (comprising the SfiI site)
were kinased and annealed. After ligation of these
adaptors to the DNA fragments a second
size-fractionation was performed on an agarose gel. The
DNA fragments of 400 to 1150 bp were eluted from the
gel by centrifugal filtration.
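
The two adaptor oligonucleotides quoted above can be checked for complementarity computationally. The following minimal Python sketch, with illustrative function names, takes the reverse complement of the 8-mer and tests whether it matches the 5' end of the 11-mer; the unpaired remainder of the 11-mer is the single-stranded tail of the annealed adaptor. It is a sequence check only and makes no claim about ligation efficiency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_and_tail(long_oligo, short_oligo):
    """If the short oligo pairs with the 5' end of the long oligo, return
    (paired region of the long oligo, unpaired 3' tail); otherwise None."""
    paired = reverse_complement(short_oligo)
    if long_oligo.startswith(paired):
        return paired, long_oligo[len(paired):]
    return None


if __name__ == "__main__":
    eleven_mer = "GTTGGCCTTTT"  # 11-mer adaptor oligonucleotide from the text
    eight_mer = "AGGCCAAC"      # 8-mer adaptor oligonucleotide from the text
    print(duplex_and_tail(eleven_mer, eight_mer))
```

For the sequences given, the 8-mer pairs with the first eight bases of the 11-mer, leaving three unpaired T residues at the 3' end of the 11-mer.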

* Preparation of the pGAL1PSiST-1 vector fragment
The small genomic DNA fragments were cloned downstream
of the GAL1 promoter in the vector pGAL1PSiST-1.
Qiagen-purified pGAL1PSiST-1 plasmid DNA was digested with
SfiI and the largest vector fragment eluted from the
gel by centrifugal filtration (Zhu et al., 1985).
Ligation with a control DNA fragment, flanked by SfiI
sites, was performed as a control. The ligation mix
was electroporated into MC1061 E. coli cells. Plasmid
DNA of 24 clones was analyzed. In all cases the
control fragment was inserted in the pGAL1PSiST-1
vector fragment.

* Upscaling

All genomic DNA fragments (450 ng) were ligated
into the pGAL1PSiST-1 vector (20 ng). After
electroporation at 2500 V, 40 µF, circa 400,000 clones
were obtained. These clones were pooled into three
groups and stored as glycerol slants. Qiagen-purified
DNA was also prepared from these clones. A clone
analysis showed an average insert length of 600 bp and
that 91% of clones carried an insert. The size
of the library corresponds to 5 times the diploid
genome. The genomic DNA inserts are in sense or
anti-sense orientation in the vector.
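
The library figures above can be combined into a rough estimate of the total cloned DNA. The following minimal Python sketch is an illustration only: the function names are hypothetical, and because the text does not state the genome size used, the genome size is left as a parameter. Dividing the total insert DNA by the stated five-fold coverage merely back-calculates the genome size that the stated coverage would imply.

```python
def library_insert_bp(n_clones, percent_with_insert, mean_insert_bp):
    """Total cloned insert DNA in bp, using integer arithmetic."""
    clones_with_insert = n_clones * percent_with_insert // 100
    return clones_with_insert * mean_insert_bp


def fold_coverage(total_insert_bp, genome_bp):
    """How many genome equivalents the library represents."""
    return total_insert_bp / genome_bp


if __name__ == "__main__":
    # Figures from the text: circa 400,000 clones, 91% with insert, 600 bp average.
    total = library_insert_bp(400_000, 91, 600)
    print(total)
    # Genome size implied by the stated five-fold coverage (a back-calculation,
    # not a value given in the text).
    print(total // 5)
```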

Candida albicans cDNA library

Total RNA was extracted from Candida albicans
B2630 grown on minimal (SD) and rich (YPD) medium
respectively, as described by Chirgwin et al. in
Sambrook et al. mRNA was prepared from total RNA
using the Invitrogen Fast Track procedure.

First strand cDNA is synthesised with the
Superscript Reverse Transcriptase (BRL) and with an
oligo dT-NotI primer adapter. After second strand
synthesis, cDNA is polished with Klenow enzyme and
purified over a Sephacryl S-400 spun column.
Phosphorylated SfiI adapters are then ligated to the
cDNA, followed by digestion with the NotI restriction
enzyme. The SfiI/NotI cDNA is then purified and sized
on a Biogel column A150M.

The first fraction contains approximately 38,720
clones by transformation, the second fraction only
1540 clones. Clone analysis:

Fr. I: 22/24 inserts, 16 > 1000 bp, 4 > 2000 bp,
average size: 1500 bp,

Fr. II: 9/12 inserts, 3 > 1000 bp, average size: 960
bp

cDNA was ligated in a NotI/SfiI opened pGAL1PNiST-1
vector (anti-sense)

Candida transformation

The host strain used for transformation is a C.
albicans ura3 mutant, CAI-4, which contains a deletion
in orotidine-5'-phosphate decarboxylase and was
obtained from William Fonzi, Georgetown University
(Fonzi and Irwin). CAI-4 was transformed with the
above described cDNA library or genomic library using
the Pichia spheroplast module (Invitrogen). Resulting
transformants were plated on minimal medium
supplemented with glucose (SD, 0.67% or 1.34% Yeast
Nitrogen base w/o amino acids + 2% glucose) plates
and incubated for 2-3 days at 30 °C.

Screening for mutants

Starter cultures were set up by inoculating each
colony in 1 ml SD medium and incubating overnight at
30°C and 300 rpm. Cell densities were determined using
a Coulter counter (Coulter Z1; Coulter Electronics
Limited). 250,000 cells/ml were inoculated in 1 ml SD
medium and cultures were incubated for 24 hours at
30°C and 300 rpm. Cultures were washed in minimal
medium without glucose (S) and the pellet resuspended
in 650 µl S medium. 8 µl of this culture is used for
inoculating 400 µl cultures in a Honeywell-100 plate
(Bioscreen analyzer; Labsystems). Each transformant
was grown for three days in S medium containing
LiAc, pH 6.0, with 2% glucose/2% maltose or 2%
galactose/2% maltose respectively while shaking every
3 minutes for 20 seconds. Optical densities were
measured every hour during three consecutive days and
growth curves were generated (Bioscreen analyzer;
Labsystems).

Growth curves of transformants grown in anti-sense
non-inducing (glucose/maltose) and inducing
(galactose/maltose) medium, respectively, are compared
and those transformants showing impaired growth upon
anti-sense induction are selected for further
analysis. Transformants showing impaired growth by
virtue of integration into a critical gene are also
selected.
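
The comparison of growth curves described above can be sketched as follows. This is only an illustration, not the procedure used with the Bioscreen analyzer: the use of the area under the optical density curve as the growth measure, the function names and the cutoff of 0.5 are all assumptions introduced here.

```python
from typing import Sequence


def area_under_curve(times_h: Sequence[float], od: Sequence[float]) -> float:
    """Trapezoidal area under an optical density curve (hours x OD)."""
    if len(times_h) != len(od) or len(od) < 2:
        raise ValueError("need at least two matched time/OD points")
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for t0, t1, y0, y1 in zip(times_h, times_h[1:], od, od[1:])
    )


def growth_ratio(times_h, od_induced, od_non_induced) -> float:
    """Growth in inducing medium relative to growth in non-inducing medium."""
    return area_under_curve(times_h, od_induced) / area_under_curve(
        times_h, od_non_induced
    )


def is_impaired(times_h, od_induced, od_non_induced, cutoff: float = 0.5) -> bool:
    """Flag a transformant whose induced growth falls below the cutoff.

    The cutoff is a hypothetical value chosen for illustration only.
    """
    return growth_ratio(times_h, od_induced, od_non_induced) < cutoff
```

A transformant would be called with the hourly time points and the two OD series (galactose/maltose and glucose/maltose); a ratio well below 1 suggests impaired growth upon anti-sense induction.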

Isolation of genomic or cDNA inserts 

Putatively interesting transformants are grown in
1.5 ml SD overnight and genomic DNA is isolated using
the Nucleon MI Yeast kit (Clontech). Concentration of
genomic DNA is estimated by analyzing a sample on an
agarose gel.

20 ng of genomic DNA is digested for three hours
with an enzyme that cuts uniquely in the library
vector (SacI for the genomic library; PstI for the
cDNA library) and treated with RNase. Samples are
phenol/chloroform extracted and precipitated using
NaOAc/ethanol.

The resulting pellet is resuspended in 500 µl
ligation mixture (1 x ligation buffer and 4 units of
T4 DNA ligase; both from Boehringer) and incubated
overnight at 15°C.

After denaturation (20 min at 65°C), purification
(phenol/chloroform extraction) and precipitation
(NaOAc/ethanol) the pellet is resuspended in 10 µl
MilliQ (Millipore) water.

PCR analysis

Inverse PCR is performed on 1 µl of the
precipitated ligation reaction using library vector
specific primers (oligo23 5' TGC-AGC-TCG-ACC-TCG-ACT-G
3' and oligo25 5' GCG-TGA-ATG-TAA-GCG-TGA-C 3' for the
genomic library; 3pGALNistPCR primer:
5' TGAGCAGCTCGCCGTCGCGC 3' and 5pGALNistPCR primer:
5' GAGTTATACCCTGCAGCTCGAC 3' for the cDNA library; both
from Eurogentec) for 30 cycles each consisting of (a)
1 min at 95°C, (b) 1 min at 57°C, and (c) 3 min at
72°C. In the reaction mixture 2.5 units of Taq
polymerase (Boehringer) with TaqStart antibody
(Clontech) (1:1) were used, and the final
concentrations were 0.2 µM of each primer, 3 mM MgCl2
(Perkin Elmer Cetus) and 200 µM dNTPs (Perkin Elmer
Cetus). PCR was performed in a Robocycler
(Stratagene).
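
As a rough check on the primers listed above, the sketch below computes length, GC content and a rule-of-thumb melting temperature for the two genomic library primers. The Wallace rule (2°C per A/T, 4°C per G/C) is an approximation chosen here for illustration; it is not stated in the text and is not how the 57°C annealing step was necessarily derived. Function names are hypothetical.

```python
def clean(primer: str) -> str:
    """Drop the hyphens and spaces used for readability in the text."""
    return "".join(ch for ch in primer.upper() if ch in "ACGT")


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = clean(primer)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences copied from the text (genomic library).
OLIGO23 = "TGC-AGC-TCG-ACC-TCG-ACT-G"
OLIGO25 = "GCG-TGA-ATG-TAA-GCG-TGA-C"

for name, primer in (("oligo23", OLIGO23), ("oligo25", OLIGO25)):
    print(name, len(clean(primer)), round(gc_percent(primer), 1), wallace_tm(primer))
```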

Sequence determination

Resulting PCR products were purified using a PCR
purification kit (Qiagen) and were quantified by
comparison of band intensity on an EtBr stained agarose
gel with the intensity of DNA marker bands. The amount
of PCR product (expressed in ng) used in the
sequencing reaction is calculated as the length of the
PCR product in basepairs divided by 10. Sequencing
reactions were performed using the ABI Prism BigDye
Terminator Cycle Sequencing Ready Reaction Kit
according to the instructions of the manufacturer (PE
Applied Biosystems, Foster City, CA) except for the
following modifications.
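
The rule for the amount of template (length in basepairs divided by 10, in ng) can be written as a small helper. The conversion to a pipetting volume assumes a product concentration estimated from the gel; that second function and both names are illustrative additions, not part of the described protocol.

```python
def template_ng(product_length_bp: int) -> float:
    """Amount of PCR product (ng) for sequencing: length in bp / 10."""
    if product_length_bp <= 0:
        raise ValueError("product length must be positive")
    return product_length_bp / 10.0


def template_volume_ul(product_length_bp: int, conc_ng_per_ul: float) -> float:
    """Volume to pipette, given a concentration estimated from the gel."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return template_ng(product_length_bp) / conc_ng_per_ul
```

For example, a 1500 bp product calls for 150 ng, which is 3 µl of a preparation estimated at 50 ng/µl.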

The total reaction volume was reduced to 15 µl.
Reaction volumes of individual reagents were changed
accordingly. 6.0 µl Terminator Ready Reaction Mix was
replaced by a mixture of 3.0 µl Terminator Ready
Reaction Mix + 3.0 µl Half Term (GENPAK Limited,
Brighton, UK). After cycle sequencing, reaction
mixtures were purified over Sephadex G50 columns
prepared on Multiscreen HV opaque microtiter plates
(Millipore, Molsheim, Fr) and were dried in a
SpeedVac. Reaction products were resuspended in 3 µl
loading buffer. Following denaturation for 2 min at
95°C, 1 µl of sample was applied on a 5% Long Ranger
Gel (36 cm well-to-read) prepared from Singel Packs
according to the supplier's instructions (FMC
BioProducts, Rockland, ME). Samples were run for 7
hours (2X run) on an ABI 377XL DNA sequencer. Data
collection version 2.0 and Sequence analysis version
3.0 (for basecalling) software packages are from PE
Applied Biosystems. Resulting sequence text files
were copied onto a server for further analysis.



Sequence analysis

Nucleotide sequences were imported in the
VectorNTI software package (InforMax Inc, North
Bethesda, MD, USA), and the vector and insert regions
of the sequences were identified. Sequence similarity
searches against public and commercial sequence
databases were performed with the BLAST software
package (Altschul et al., 1990) version 1.4. Both the
original nucleotide sequence and the six-frame
conceptual translations of the insert region were used
as query sequences. The public databases used were the
EMBL nucleotide sequence database (Stoesser et al.,
1998), the SWISS-PROT protein sequence database and
its supplement TrEMBL (Bairoch and Apweiler, 1998),
and the ALCES Candida albicans sequence database
(Stanford University, University of Minnesota). The
commercial sequence databases used were the LifeSeq®
human and PathoSeq™ microbial genomic databases
(Incyte Pharmaceuticals Inc., Palo Alto, CA, USA), and
the GENESEQ patent sequence database (Derwent, London,
UK). Three major results were obtained on the basis of
the sequence similarity searches: function, novelty,
and specificity. A putative function was deduced on
the basis of the similarity with sequences with a
known function, the novelty was based on the absence
or presence of the sequences in public databases, and
the specificity was based on the similarity with
vertebrate homologues.

Methods

Blastx of the nucleic acid sequences against the
appropriate protein databases: Swiss-Prot for clones
of which the complete sequence is present in the
public domain, and paorfp (PathoSeq™) for clones of
which the complete sequence is not present in the
public domain.

The protein to which the translated nucleic acid
sequence corresponds is used as a starting point.
The differences between this protein and our
translated nucleic acid sequences are marked with a
double line and annotated above the protein sequence.
The following symbols are used:

A one-letter amino acid code or the ambiguity
code X is used if our translated nucleic acid sequence
has another amino acid at a certain position.

The stop codon sign * is used if our translated
nucleic acid sequence has a stop codon at a certain
position.

The letters fs (frame shift) are used if a frame
shift occurs in our translated nucleic acid sequence
and another reading frame is used.

The words ambiguity or ambiguities are used if a
part of our translated nucleic acid sequence is
present in the proteins, but not visible in the
alignments of the blast results.

The phrase missing sequence is used if the
translated nucleic acid sequence does not comprise
that part of the protein.

Blastx: compares the six-frame conceptual
translation products of a nucleotide query sequence
(both strands) against a protein sequence database.
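
A six-frame conceptual translation of the kind Blastx performs on the query can be sketched as below. This is not the BLAST implementation. The codon table is the standard genetic code; the optional `cug_as_serine` flag is an addition made here because C. albicans is known to decode the CUG codon as serine rather than leucine (NCBI translation table 12). All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, table: dict) -> str:
    """Translate complete codons only; unknown codons become X."""
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq: str, cug_as_serine: bool = False) -> dict:
    """Return the translations of the three forward and three reverse frames."""
    table = dict(STANDARD)
    if cug_as_serine:
        table["CTG"] = "S"  # CUG read as serine, as in C. albicans
    seq = seq.upper().replace(" ", "")
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:], table)
    return frames
```

Calling `six_frames("ATGCTGTAA")` gives "ML*" for frame +1 under the standard code and "MS*" with `cug_as_serine=True`, which is why the choice of table matters when comparing translated Candida sequences with protein databases.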

Screening for compounds modulating expression of
polypeptides critical for growth and survival of C.
albicans

The method proposed is based on observations
(Sandbaken et al., 1990; Hinnebusch and Liebman 1991;
Ribogene PCT WO 95/11969, 1995) suggesting that
underexpression or overexpression of any component of
a process (e.g. translation) could lead to altered
sensitivity to an inhibitor of a relevant step in that
process. Such an inhibitor should be more potent
against a cell limited by a deficiency in the
macromolecule catalyzing that step and/or less potent
against a cell containing an excess of that
macromolecule, as compared to the wild type (WT) cell.

Mutant yeast strains, for example, have shown
that some steps of translation are sensitive to the
stoichiometry of macromolecules involved (Sandbaken
et al.). Such strains are more sensitive to compounds
which specifically perturb translation (by acting on a
component that participates in translation) but are
as sensitive as the wild type to compounds with other
mechanisms of action.

This method thus not only provides a means to 
identify whether a test compound perturbs a certain 
process but also an indication of the site at which it 
exerts its effect. The component which is present in 
altered form or amount in a cell whose growth is 
affected by a test compound is potentially the site of 
action of the test compound. 

The assay to be set up involves measurement of
growth of an isogenic strain which has been modified
only in a certain specific allele, relative to a wild
type (WT) C. albicans strain, in the presence of
R-compounds. Strains can be ones in which the
expression of a specific essential protein is impaired
upon induction of anti-sense or strains which carry
disruptions in an essential gene. An in silico
approach to finding novel essential genes in C.
albicans will be performed. A number of essential
genes identified in this way will be disrupted (in one
allele) and the resulting strains can be used for
comparative growth screening.

Assay for High Throughput screening for drugs

35 µl minimal medium (S medium + 2% galactose +
2% maltose) is transferred into a transparent
flat-bottomed 96 well plate using an automated pipetting
system (Multidrop, Labsystems). A 96-channel pipettor
(Hydra, Robbins Scientific) transfers 2.5 µl of
R-compound at 10"^ M in DMSO from a stock plate into the
assay plate.

The selected C. albicans strains (mutant and
parent (CAI-4) strain) are stored as glycerol stocks
(15%) at -70°C. The strains are streaked out on
selective plates (SD medium) and incubated for two
days at 30°C. For the parent strain, CAI-4, the medium
is always supplemented with 20 µg/ml uridine. A single
colony is scooped up and resuspended in 1 ml minimal
medium (S medium + 2% galactose + 2% maltose). Cells
are incubated at 30°C for 8 hours while shaking at
250 rpm. A 10 ml culture is inoculated at 250,000
cells/ml. Cultures are incubated at 30°C for 24
hours while shaking at 250 rpm. Cells are counted in
a Coulter counter and the final culture (S medium + 2%
galactose + 2% maltose) is inoculated at 20,000 to
50,000 cells/ml. Cultures are grown at 30°C while
shaking at 250 rpm until a final OD of 0.24 (+/- 0.04)
at 600 nm is reached.
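
Inoculating a culture at a fixed cell density from a counted starter culture is a simple dilution (C1 x V1 = C2 x V2). The helper below is a sketch of that arithmetic only; the function name and the example count are illustrative and do not come from the protocol.

```python
def inoculum_volume_ml(
    counted_cells_per_ml: float,
    target_cells_per_ml: float,
    final_volume_ml: float,
) -> float:
    """Volume of counted culture needed to reach the target density."""
    if counted_cells_per_ml <= 0 or final_volume_ml <= 0:
        raise ValueError("count and final volume must be positive")
    if target_cells_per_ml > counted_cells_per_ml:
        raise ValueError("starter culture is too dilute for this target")
    return target_cells_per_ml * final_volume_ml / counted_cells_per_ml
```

With a hypothetical Coulter count of 5,000,000 cells/ml, a 10 ml culture at 250,000 cells/ml needs 0.5 ml of starter culture made up to 10 ml with medium.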

200 µl of this yeast suspension is added to all
wells of MW96 plates containing R-compounds in a 450
µl total volume. MW96 plates are incubated (static) at
30°C for 48 hours.
Optical densities are measured after 48 hours.

Test growth is expressed as a percentage of 
positive control growth for both mutant (x) and wild 
type (y) strains. The ratio (x/y) of these derived 
variables is calculated. 
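
The calculation of the derived variables can be sketched as follows. It assumes that "positive control growth" means the optical density of the same strain grown without compound; that reading, and the function names, are assumptions made for this illustration.

```python
def percent_of_control(test_od: float, control_od: float) -> float:
    """Growth with compound as a percentage of growth without compound."""
    if control_od <= 0:
        raise ValueError("control OD must be positive")
    return 100.0 * test_od / control_od


def mutant_to_wild_type_ratio(
    mutant_test_od: float,
    mutant_control_od: float,
    wt_test_od: float,
    wt_control_od: float,
) -> float:
    """Ratio x/y of percent growth of mutant (x) and wild type (y)."""
    x = percent_of_control(mutant_test_od, mutant_control_od)
    y = percent_of_control(wt_test_od, wt_control_od)
    if y == 0:
        raise ValueError("wild type shows no growth; ratio is undefined")
    return x / y
```

A ratio clearly below 1 indicates that the compound inhibits the mutant more strongly than the wild type, which is the pattern the screen is designed to detect.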

SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT: 

(A) NAME: Janssen Pharmaceutica 

(B) STREET: Turnhoutseweg 30 

(C) CITY: Beerse 

(E) COUNTRY: Belgium 

(F) POSTAL CODE (ZIP): B-2340 

(G) TELEPHONE: +32 (0)14/60.21.11 

(H) TELEFAX: +32 (0) 14/60.28.41 

(ii) TITLE OF INVENTION: DRUG TARGETS IN CANDIDA ALBICANS 
(iii) NUMBER OF SEQUENCES: 72 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: GB 9817796.7

(B) FILING DATE: 14-AUG-1998

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 255 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

AACGTTCGTG CAAAAGGCTA TACTGGTGAT ATCCACGCAG ATGAAGAGCA AGTTTAATCA 60 

ACTCTTTGTC AATTAATGCT GTACTTGTTT TCATTTTATT TGCTGGCATT TAAAGAATAC 120 

CCATAGTTCA GAAAATAAAA TTGAAAAATT TAAAAAAAAA CGCAATATCA TTCATTTTTT 180 

TTGTTTTTTT GACAATAATA TTAATATGTA GTTACCAATG TTTTTAGATT TTATATGTTT 240

TGAAAAAATA GTTTG 255 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 648 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 




AACCTCTTAT TCGGTTCTAG TGTCTCT^TT GGTTATCCAT TAACATCTAT TCCCAACTCC 60 

ATCATTATTG GCAATAAATA AATGGGTGTT ATATCTATTG GTAATAACTA AACTGGTGTC 120 

AATTCAATTC CAATATGGTC ATGACAATTG AAAGTGTTAC TGTTCTGGTT TACATATTCT 180 

ACAGGTTACA ACTATTGATT GGTTAGAAGT TTGGTTTCAA CATCACCTGT TGCTAAGAAT 240 

AAATGTTGGT CATATCAATT GAATCATTTG TTGGTGTTAT GGTAAGTAAA TGCTGGTTAT 300 

ATCTATTATC TACAACCACC AAGTGATAAA TGCTGAACCG TAGTCACCAA CTGTTATGCT 360 

GGTTGTATCT ATTGACTAAA ACTACCCTAG GGATAAATGC TGAACCGTGG TTACCAACTG 420 

TTATGCTGGT TGTATCTATT AACTGCAACC ACCAAATGAT AAATGCTGAA CCATAATTAC 480

CAACTGTTAC ATTGCTGGTA CTACATTAAG AATAAATGCT GCATCTACAA GTACCACCTG 540 

TTGTGTTAAT AAATGCTGCA CCTGCTAGTA CAACTGTTGC TGGTCATGAT AGTTACTACA 600 

CATTACACAC CAGACAGTGG CAAACAAGGT TATGTAGAAA CCAACGTT 648 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 904 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

AACTGTCCTG TGAAGACGAA CATCACAACC ACAATCATGG TCATAACCAA AATCACAATC 60 

ATGTTGCTCC TATTCCTACA ACAGCTGGAC AATCATTAAA TAATAAAATT GATACATCTA 120 

AAGTGACAGC TCTCAACATG GCCAACTCTG CTGACGATCT AGCAAAAGTT TTCAAAGATT 180 

CGACTAAAAA ATATCAAATC AAACCAATTA TCAAATCAGA CAGTGATGAA CAAATGATTA 240 

TCAACATTCC ATTTCTTAAT GGTAGTGTCA AATTGTATTC GATAATTCTA CGTACCAATG 300 

GGGATTTGTA TTGTCCCAAA ACAATAAAAT TATTCAAAAA TGACACATCA ATTGATTTTG 360

ATAATGTGGA TTCGAAGAAA CCAATACAGG TGTTAACTCA TCCTCAAGTT GGTGTTGCTA 420 

ATAATGATAG CGATGATCTT CCAGAGTTTT TGGAATCAAA TAACGATGAC GATTTTGTCG 480

AACATTATGT GTCTCGACAT AAATTCACTG GGGTAAATCA ATTGACAATA TTTATTGAAG 540 

ATATTTATGA TGAAGGAGAA GAAGAGTGTC ATTTACATTC AATTGAATTG AGAGGGGAAT 600 

TCACTGAATT AAACAAAGAC CCTGTCATTA CATTATATGA ACTGGCTGCT AATCCTGCTG 660 

ATCATAAGAA TTTAACGATT GTTGAAAATC AAAATCTAGC ATAAAACAAA GAAGTGAAAG 720 

GTATCAGATA AGCTGGTTAC ATTACAATTG ATCTAATTTA GAATCTCAAG GTATTTAAAT 780

TTGCCGTTTT GCGATAATAT AACATGGTCA AGAACGTTGA ATCGATTACG TTAATGGTTT 840 

AGCTAATTGA TTTTTAGGAT CGAGTATTTA GAGTGAATAA ACAATAAACA AGAATGATGA 900 

ATTG 904 
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 232 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Ser Cys Glu Asp Glu His His Asn His Asn His Gly His Asn Gln Asn
1 5 10 15

His Asn His Val Ala Pro Ile Pro Thr Thr Ala Gly Gln Ser Leu Asn
20 25 30

Asn Lys Ile Asp Thr Ser Lys Val Thr Ala Leu Asn Met Ala Asn Ser
35 40 45

Ala Asp Asp Leu Ala Lys Val Phe Lys Asp Ser Thr Lys Lys Tyr Gln
50 55 60

Ile Lys Pro Ile Ile Lys Ser Asp Ser Asp Glu Gln Met Ile Ile Asn
65 70 75 80

Ile Pro Phe Leu Asn Gly Ser Val Lys Leu Tyr Ser Ile Ile Leu Arg
85 90 95

Thr Asn Gly Asp Leu Tyr Cys Pro Lys Thr Ile Lys Leu Phe Lys Asn
100 105 110

Asp Thr Ser Ile Asp Phe Asp Asn Val Asp Ser Lys Lys Pro Ile Gln
115 120 125

Val Leu Thr His Pro Gln Val Gly Val Ala Asn Asn Asp Ser Asp Asp
130 135 140

Leu Pro Glu Phe Leu Glu Ser Asn Asn Asp Asp Asp Phe Val Glu His
145 150 155 160

Tyr Val Ser Arg His Lys Phe Thr Gly Val Asn Gln Leu Thr Ile Phe
165 170 175

Ile Glu Asp Ile Tyr Asp Glu Gly Glu Glu Glu Cys His Leu His Ser
180 185 190

Ile Glu Leu Arg Gly Glu Phe Thr Glu Leu Asn Lys Asp Pro Val Ile
195 200 205

Thr Leu Tyr Glu Ser Ala Ala Asn Pro Ala Asp His Lys Asn Leu Thr
210 215 220

Ile Val Glu Asn Gln Asn Leu Ala
225 230

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 608 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 






(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
AACCTACAAA AGACTCACAT GTGCTGTACA ATAAATTTCT GGATAAGCAT ATAAGTGATG 60

AGCAACTATC ACACTTACTC GACAATCATA AACCCAATCT AGTGACTACC ACAACTTTAA 120

TTGATTCTAT CAAAGAAAGT GAACTGTTAT ATAATACCAT GGACAGTTTG ATGATAAAAT 180

CCATCAATTT TCCTGCAGCC ATGTACCAGT CAAATGACAA CAATTCACAA TCACCAATCG 240

AGTATTTATC TAACAGAGTA AAATTGCTCA CACAAGAGTT ATACGAAGAT TCAGTCAAAT 300

ATGGCAAGTT TCTACAGAGT GGTAATAATC ATATATATCA ATTACGAAGT AGGATTTTAC 360

AGACCTTTGA TCAGTTGTCA GAGAGTCACT ATTCTTTAAA TGAACTATAT AATAAAGACA 420

TGTCTTACGC AGAAACATTA CACGGATCTT TCAAGAAATG GGATCAACAA AGAAATAAAG 480

TATTGTCCAA AGTGAAATCT ATAAAAAGTG ATACAAGCAA ACATGGAGCC AAATTATTCA 540

CCTTATTAGA TGAAGTTAAT GATGTTGATG ACGAGATCAA ACTTTTGGAA GCAAAACTAC 600

AGCAGGTT 608

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1497 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GATATCTGCA GAATTCGGCT TCTCTCTCAT CTTCACACAA TGCATTTTAC AAGTAGCCTA 60

CTAGCCACCT TGATATGGTT TACATTACCG GTTCAAAGTT TGAATACTGA ATCTAGGACA 120

ACTTCAAATA ACACAATATC AATACTTACA AACCATTTTC AAATACTAAA GGATTTGCTA 180

CCATATAGCA AAACTTCTAA ACCGCAAATC AAGGAATCCA GACCGTTGAT TAAAGTTCTG 240

AGAGATGGAG TGCCAATAAA TTTCCACAGG GCTCCGGCTA TAATAATGAA ATCGAACAAA 300

ACAGACGATT TAGTCAGGAA TAGCAATAAA ACAATGGTGC TAACTGAAAT AAAAACGATT 360

ACTGAATTTG CAACTACCAC TGTTTCCCCT ACACAAGAAT TTCAAGCACT ACAGATAAAC 420

CTTAACACGT TATCAATAGA GACTTCAACA CCAACATTCC AATCCCATGA CTTTCCACCG 480

ATTACCATTG AAGACACACC CAAAACACTA GAACCAGAAG AATCGTCAGA TGCTTTGCAG 540

AGGGATGCAT TTGATCAAAT TAAGAAACTA GAAAAATTGG TATTGGATTT GAGACTTGAA 600

ATGAAAGAGC AACAAAAGAG TTTCAACGAT CAATTAGTGG ATATATATAC CGCAAGAAGT 660

ATTGTTCCAA TTTATACTAC ACATATCGTC ACTTCGGCGA TTCCATCGTA TGTACCAAAA 720

GAAGAAGTAA TGGTTTCACA TGATACTGCA CCAATTGTAA GTCGTCCTAG AACAGATATT 780

CCAGTATCTC AACGAATTGA TACTATCTCA AAACATAAAA TGAATGGAAA AAATATATTG 840

AACAACAATC CTCCGCCCAA TTCAGTTTTA ATAGTTCCTC AGTTTCAGTT CCATGAAAGA 900 

ATGGCCACCA AAACCGAAGT AGCTTATATG AAACCAAAAA TTGTCTGGAC CAACTTTCCA 960 

ACCACTACTG CAACGTCAAT GTTTGACAAT TTTATTTTAA AAAATCTTGT TGACGAAACG 1020 

GATTCTGAAA TTGATAGTGG TGAAACTGAA TTGTCTGACG ATTATTATTA CTATTATAGT 1080 

TACGAAGATG ATGGTAAAGA AGACGATAGT GATGAGATTA CGGCTCAAAT ACTATTATCC 1140

AATTCAGAAT TAGGCACGAA GACGCCAAAT TTTGAGGATC CTTTTGAACA AATCAATATT 1200 

GAAGACAATA AAGTAATATC TGTTAATACA CCAAAGACAA AGAAACCTAC TACAACAGTA 1260 

TTTGGCACTT CTACTAGTGC ATTATCAACT TTTGAAAGTA CAATATTTGA AATTCCCAAA 1320 

TTCTTTTATG GTAGCAGAAG AAAACAACTG AGCTCATTCA AAAATAAGAA CAGTACAATC 1380

AAATTTGATG TGTTTGATTG GATATTTGAA AGTGGTACTA CCAATGAGAA AGTACATGGA 1440

TTAGTGTTGG TGTCTAGTGG TGTTCTACTA GGAACTTGTC TATTGTTCAT TTTGTAG 1497 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met His Phe Thr Ser Ser Leu Leu Ala Thr Leu Ile Trp Phe Thr Leu
1 5 10 15

Pro Val Gln Ser Leu Asn Thr Glu Ser Arg Thr Thr Ser Asn Asn Thr
20 25 30

Ile Ser Ile Leu Thr Asn His Phe Gln Ile Leu Lys Asp Leu Leu Pro
35 40 45

Tyr Ser Lys Thr Ser Lys Pro Gln Ile Lys Glu Ser Arg Pro Leu Ile
50 55 60

Lys Val Ser Arg Asp Gly Val Pro Ile Asn Phe His Arg Ala Pro Ala
65 70 75 80

Ile Ile Met Lys Ser Asn Lys Thr Asp Asp Leu Val Arg Asn Ser Asn
85 90 95

Lys Thr Met Val Leu Thr Glu Ile Lys Thr Ile Thr Glu Phe Ala Thr
100 105 110

Thr Thr Val Ser Pro Thr Gln Glu Phe Gln Ala Leu Gln Ile Asn Leu
115 120 125

Asn Thr Leu Ser Ile Glu Thr Ser Thr Pro Thr Phe Gln Ser His Asp
130 135 140

Phe Pro Pro Ile Thr Ile Glu Asp Thr Pro Lys Thr Leu Glu Pro Glu
145 150 155 160

Glu Ser Ser Asp Ala Leu Gln Arg Asp Ala Phe Asp Gln Ile Lys Lys
165 170 175

Leu Glu Lys Leu Val Leu Asp Leu Arg Leu Glu Met Lys Glu Gln Gln
180 185 190

Lys Ser Phe Asn Asp Gln Leu Val Asp Ile Tyr Thr Ala Arg Ser Ile
195 200 205

Val Pro Ile Tyr Thr Thr His Ile Val Thr Ser Ala Ile Pro Ser Tyr
210 215 220

Val Pro Lys Glu Glu Val Met Val Ser His Asp Thr Ala Pro Ile Val
225 230 235 240

Ser Arg Pro Arg Thr Asp Ile Pro Val Ser Gln Arg Ile Asp Thr Ile
245 250 255

Ser Lys His Lys Met Asn Gly Lys Asn Ile Leu Asn Asn Asn Pro Pro
260 265 270

Pro Asn Ser Val Leu Ile Val Pro Gln Phe Gln Phe His Glu Arg Met
275 280 285

Ala Thr Lys Thr Glu Val Ala Tyr Met Lys Pro Lys Ile Val Trp Thr
290 295 300

Asn Phe Pro Thr Thr Thr Ala Thr Ser Met Phe Asp Asn Phe Ile Leu
305 310 315 320

Lys Asn Leu Val Asp Glu Thr Asp Ser Glu Ile Asp Ser Gly Glu Thr
325 330 335

Glu Leu Ser Asp Asp Tyr Tyr Tyr Tyr Tyr Ser Tyr Glu Asp Asp Gly
340 345 350

Lys Glu Asp Asp Ser Asp Glu Ile Thr Ala Gln Ile Leu Leu Ser Asn
355 360 365

Ser Glu Leu Gly Thr Lys Thr Pro Asn Phe Glu Asp Pro Phe Glu Gln
370 375 380

Ile Asn Ile Glu Asp Asn Lys Val Ile Ser Val Asn Thr Pro Lys Thr
385 390 395 400

Lys Lys Pro Thr Thr Thr Val Phe Gly Thr Ser Thr Ser Ala Leu Ser
405 410 415

Thr Phe Glu Ser Thr Ile Phe Glu Ile Pro Lys Phe Phe Tyr Gly Ser
420 425 430

Arg Arg Lys Gln Ser Ser Ser Phe Lys Asn Lys Asn Ser Thr Ile Lys
435 440 445

Phe Asp Val Phe Asp Trp Ile Phe Glu Ser Gly Thr Thr Asn Glu Lys
450 455 460

Val His Gly Leu Val Leu Val Ser Ser Gly Val Leu Leu Gly Thr Cys
465 470 475 480

Leu Leu Phe Ile Leu
485



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1651 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GAGCTCTTCC AGAGGCAACA AGCGGAAGAA GCACAACGAA AGAAGGAATT TGAACAAAAG 60 

GCCGAATTCA TCAAAGCATC ATTACTTGAA ATGCGCCGAA GAGAAATAGA GAGGCGGAAA 120 

CAGCAAAAGG AAAGGGAACA AAGACAAT^ GAGCACGAAG CAAAGAGGGA TATCAGGATA 180 

CAACAACTTT CAGAGCAGGA TTCACGGAGT AATCAAACTA AAGAAGAAGA GGAAGTGTTC 240

AAGAAGGCCC GGTCTACTAA TTCGGGAGCA GACGAGACTG GTTTGATGTC AGATAAAGAG 300 

TTTGATGATT CTGCATATTC ACCCGATTAT TTGTTTGAAG AGAATTTGTG GAATAAACCA 360

AATCATCCAG ATACAAATCA TAAAACCAAA AAATATACTG AGAATGTGGT TGAAAATCTA 420 

GATTCTCCAC CAAATGATAC ATCTGCGTAC AATTCAAGTT TTCATGATGA AACTAATATT 480 

CAAAATGAGA TCCAAATACC AGAAAATGAC GAGTATGTAC CACAGATGAA AGCTACATCC 540

AGTGTCAATA ATACCACCAT CCCTGCACAA AGAAGACATG AGTCACTTTC CACTTCTGAA 600 

AACAAAAGAA GGAAATTTGA AACAGCCGAC GTTGGGGTTG ATGGGTTAGA TTCCCCAGTG 660 

CGGGCACAAC CAGAAATATC TGGAAAATCC AAGTCTCCGA TAATCCCTGA TGTAATACTT 720 

TTACTGGACG AAGAGACTGA AACTCCTGAA GCAAATGCTG TGCAGGACAA TAGTACATAT 780 

ATTCCTCAGG GGTCTTTAGG ACACGAATTT AGAAATATTT TGGAAGAGCA TCCACGTCAA 840

GTAAAGAATA AACAAAATTC TGGTGTTGCT TTTGCATTTC CGAATGCTTC CAAGAATACC 900 

GAAAACAAAC TCCACTCTAA TTTCAAAGAT AAAGATGAAG GAATAATTGA TGTTGAAGCT 960 

TACGTACCTG ATGTCAAAGC AGCAACTTCA AACACCACCC CAGCAACAGG ACAAACATCA 1020 

GCAAGGTCGG AAAAACTGCC ACCCTTACCT ACTCATATTC CAAATCCATC GACCATGAAT 1080 

GAAGCTCGAC CTCATCCAAC AACTCCACAT AAAAGATCAA AAGTCATTTT CGATTTAAAA 1140

GATTTAGAAC AAAAGTTAGG TAATGATATT GAGGATTTGG ATTTTAAGGA TATGTATGAG 1200 

AGTTTGCCTG ACCATTCAAG TAAGGCAACA CCTAAAGACG ATATTTTAAC CCGTTCTAAA 1260

AGAAGACTTT ATACATATAC CGATGGAACA TCAAAGGCTG AAACGTTATC TACACCAATG 1320 

AACAAAAATC CTGTTCGTGG ACATAGTACC AAGAAAAAGC TTAGTATGTT GGACATGCAT 1380 

GCGTCTTCTA AAATTCAAAG TCTTTTACCT CCACAACCGC CACAAATGTC AATTGATCCT 1440 

TCTGTTTCCA AGCAAGTGTG GGCTAAATAC GTTGATGCAA TCTTGACTTA TCAAAGAGAA 1500 

TTTTTCAATT ATAAAAAAGT GATTGTTCAA TACCAAATGG AACGGATAAA CAAAGACCTT 1560

GAACATTTTG ACGATATAAA TGATGGTTCA CACACTGAGA ATTTGGATAC TTTCAAGCAT 1620 

TGTTTAGAAC AAGATTATTT GGTTAGTTGA C 1651 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 463 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
AACCTGTTGA CGCGTTGTCT TTTTCTACCC CACGTTTAAC AATCTTGCCA GTCAATTCAC 
TAGCCAAATA AACTTTAGAC TCACAACTCT AACACTGACT CGCCCCCCCC TGTTTAAACT 
CTAAATTACT TCACAGAGCC TTTACTACCT TAATTTAAGA TTATCTATTG TTTCTGTTCT 
TTTGCAATCA CCCTGACTCG TTTTTTTTTC AGCCAGTTTT TTCGTAAAAT CTGACCAAAA
ATTTACAACT CTAATTTAAA ACTCTAAATA ACAATTAAAA CTCAATTCAG ACAAGTCCTT 
CTGCTCATTC TGAGTCTTCT CTATTGTCTT TTGACTTTTT GTGTGTGACT ATTTTCATGA 
TCACCCCGTT TCTTGCATTT TTTTCAGTCA ACTTTTTCTC AAAATCAAGC CAAT^AAAACA 
CATTTAACTG CCTATACAAC GCAAACCTAT TCAAAACAAG GTT 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 582 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
AACCTCCCCG TTAACCACTT CTAGGTATAC CATTTCATCT GACTGAATAA CTGGTTAGTC 
GATTTGTTGT TGAAGAAAAG TGACCACCTA GTTTTTTCTG CCAACATTTT TTGCGATGAG 
CCGTCGACGC GTTGTCTTTT TCTACCCCAC GTTTAACAAT CTTGCCAGTC AATTCCCTAG 
CCAAATAAAC TTTAGACTCA CAACTCTAAC ACTGACTCGT GCCCCCCTGT TTAAACTCTA 
AATTACTTCA CAGAGCCTTT ACTACCTTAA TTTAAGATTA TCTATTGTTT CTGTTTTTTT 
GCAATCACCC TGACTCGTTT TTTTTTCAGC CAGTTTTTTC GTAAAATCTG ACCAAAAATT 
TACAACTCTA ATTTAAAACT CTAAATAACA ATTAAAACTC AATTCAGACA AGTCCTTCTG 
CTCATTCTGA GTCTTCTCTA TTGTCTTTTG ACTTTTTGTG TGTGACTATT TTCATGATCA 
CCCCGTTTCT TGCATTTTTT TCAGTCAACT TTTTCTCAAA ATCAAGCCAA AAAAACACAC 
CTTTAACTAC CTATACAACG CAAACCTATT CAAAACAAGG TT 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1066 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 183

(D) OTHER INFORMATION: /note= "W = A or T"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 564

(D) OTHER INFORMATION: /note= "Y = C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AACCATAAAT ATGCCAAGAT TTAAACAAGT TGATGTATTC ACCAATGTCA AATATTTGGG 
TAATCCAGTT GCCGTTATTT ATGATAGTGA TAATTTAACC ACTCAAGAAA TGCAAAAAAT 
TGCTCGATGG ACAAATTTAT CAGAAACAAC ATTTATATTG ACTCCAAAAT CATCAATTGC 
TGWTTATAGT ATTAGAATTT TCACTTCTGG TGGGAATGAA TTACCATTTG CTGGTCATCC 
TACTTTAGGT ACTGCATTTG CATTATTGGA AGATGGTAAA ATAAAACCAA ATGACAATGG 
ACAAATAATT CAAGAATGTG GTGCTGGATT AGTGAAAATA TCCGTTGAAA AAACACCTAA 
TAATAATAGT AATGAGTTGC CGTTTTTGTT ATCTTTTGAA TTACCATATT TCAAATTTCA 
TGAAATTGAT GACAAAGTAA TCGAGGAATT ACAACATTCA TGGAATGGAA CCAATATTAT 
TGGTAAACCG GTACTTATTG ATGCTGGTCC AAAATGGGCA GTTTTCCAAC TTGGCTCCGG 
TAAAGAAGTA TTAGACTTGA ATGYTGATTT AGCACAAATT GAGAGATTAA GTTTAGAAAA 
TGGTTGGACA GGAATTGGTG TCTTTGGAAA ACATAATGAA AATGGTGATT CGGTCGAATT 
GAGAAATATT GCTCCTGCTG TTGGAGTCGC TGAAGATCCT GCTTGTGGAA GTGGATCAGG 
TGCTATTGGA GCATATTTGG CAAATCACGT TTTCAATGAA AAGGAAAAAT TTACAATTGA 
TATTTCTCAA GGTAAACCAA TTGAAAGAGA TGCTAAGATT CAAGTTAAAG TTAATCGTCT 
TACCACCAAA AATGGTGATT TATCTATTCA TGTTGGTGGT CATGCCATCA CTTGTTTCGA 
AGGTACTTAT TCTATTTAAA ACTTGATATA ATTCTTGAGT TATATCTAAT TTATCTAATT 
CACTTGTCCC TGGAGTAGTT TGATCTAATT GATGTAATTT ATTTAATAAA TCACGTTCTA 
AATCAGTTTG TTTAGATAAA TCATTTAATA AATCATCTTC AGCATT 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 302 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Pro Arg Phe Lys Gln Val Asp Val Phe Thr Asn Val Lys Tyr Leu
1 5 10 15

Gly Asn Pro Val Ala Val Ile Tyr Asp Ser Asp Asn Leu Thr Thr Gln
20 25 30

Glu Met Gln Lys Ile Ala Arg Trp Thr Asn Leu Ser Glu Thr Thr Phe
35 40 45

Ile Leu Thr Pro Lys Ser Ser Ile Ala Xaa Tyr Ser Ile Arg Ile Phe
50 55 60

Thr Ser Gly Gly Asn Glu Leu Pro Phe Ala Gly His Pro Thr Leu Gly
65 70 75 80

Thr Ala Phe Ala Leu Leu Glu Asp Gly Lys Ile Lys Pro Asn Asp Asn
85 90 95

Gly Gln Ile Ile Gln Glu Cys Gly Ala Gly Leu Val Lys Ile Ser Val
100 105 110

Glu Lys Thr Pro Asn Asn Asn Ser Asn Glu Leu Pro Phe Leu Leu Ser
115 120 125

Phe Glu Leu Pro Tyr Phe Lys Phe His Glu Ile Asp Asp Lys Val Ile
130 135 140

Glu Glu Leu Gln His Ser Trp Asn Gly Thr Asn Ile Ile Gly Lys Pro
145 150 155 160

Val Leu Ile Asp Ala Gly Pro Lys Trp Ala Val Phe Gln Leu Gly Ser
165 170 175

Gly Lys Glu Val Leu Asp Leu Asn Xaa Asp Leu Ala Gln Ile Glu Arg
180 185 190

Leu Ser Leu Glu Asn Gly Trp Thr Gly Ile Gly Val Phe Gly Lys His
195 200 205

Asn Glu Asn Gly Asp Ser Val Glu Leu Arg Asn Ile Ala Pro Ala Val
210 215 220

Gly Val Ala Glu Asp Pro Ala Cys Gly Ser Gly Ser Gly Ala Ile Gly
225 230 235 240

Ala Tyr Leu Ala Asn His Val Phe Asn Glu Lys Glu Lys Phe Thr Ile
245 250 255

Asp Ile Ser Gln Gly Lys Pro Ile Glu Arg Asp Ala Lys Ile Gln Val
260 265 270

Lys Val Asn Arg Leu Thr Thr Lys Asn Gly Asp Leu Ser Ile His Val
275 280 285

Gly Gly His Ala Ile Thr Cys Phe Glu Gly Thr Tyr Ser Ile
290 295 300

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2829 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
ATGACGGAAA CTGTGATAGA AAAGAAAAGA AAGGTTGATT TAAATGCCTC AGGTATTACA 60

AAACAACCAA AAGCTTCTAA AATCTTCAGT CCATTCAGAG TTTTAGGGAA TGTTACAGAC 120

TCAACTCCTT TTGCCATGGG GACATTAGGT TCAACATTTT ATGCTGTCAC TTCTGTTGGC 180

AGATCTTTCC AAATTTATGA CTTGGCTACA TTACATTTAT TGTTTGTTTC CCAAACTCAA 240 

ACTCCTTCAA GAATTACAAG TTTGGCTGCA CACCATCACT ATGTCTATGC ATCTTATGGT 300 

GATCGTATTG GTATTTTTAG ACGTGGTAGA TTAGAGCATG AATTGGTTTG TGAAGGGAAC 360 

TCTACAGTTA ACCAATTATT AGTATTTGGA GAATACCTTA TTGCTACCAC ATTAGAAGGT 420 

GATATTTTCG TATTTAGAAA AACTGAAGGA AAGAAATTCC CAACTGAATT ATACACTACA 480 

ATCAGAATAA TTAATTCTTT AGTTGAAGGA GAAATTGTGG GATTAATTCA TCCACCTACG 540 

TATTTAAATA AAGTAATTGT TGCTACTACT CAATCTGTGT TTGTTATAAA TGTGAGAACT 600 

GGCAAATTAT TATACAAATC CCGGGAATTA CAATTCGAAG GCGAAAAGAT TTCATCAATC 660 

GAAGCTGCTC CAGTTTTGGA TGTAATTGCT GTTGGTACAT CTAATGGAAA TGTATTTTTA 720 

TTCAACATTA AAAAGGGGAA AGTGTTGGGC CAAAAAATTA TTACTTCTGG AACTGAATCT 780 

TCTTCGAAAG TTGCCTCGAT CTCTTTTAGA ACAGATGGAG CACCTCATTT GGTTGCTGGT 840 

TTGAATAACG GGGACTTATA TTTCTACGAT TTAGACAAGA AATCACGTGT TCATGTTTTG 900 

AGAAATGCCC ATAAAGAGAC TCATGGGQGT GTTGCAAACG CCAAATTTTT GAATGGTCAA 960 

CCAATAGTAT TATCAAATGG TGGTGATAAT CATTTGAAAG AATTTGTTTT TGATCCTAAT 1020 

TTAACCACTT CGAATTCATC CATTGTTCCT CCTCCAAGAC ATCTCAGATC TAGAGGTGGG 1080 

CATTCAGCAC CACCAGTAGC TATTGAATTT CCTCAAGAAG ATAAAACCCA TTTTTTATTG 1140 

AGTGCTTCTA GAGATAAAAC ATTTTGGACA TTCTCTTTGA GAAAAGATGC TCAAGCACAG 1200 

GAAATGTCTC AAAGATTGCA AAAATCTAAG GATGGTAAAA GACAGGCTGG ACAAGTTGTT 1260 

TCTATGAGAG AGAAATTCCC AGAAATCATT TCCATTTCAT CCTCTTATGC CAGAGAAGGT 1320 

GATTGGGAAA ATATCATAAC CGCCCACAAG GATGAAACTT TTGCGAGAAC ATGGGATTCA 1380 

AGAAATAAAA GAGTCGGTAG ACATTTGTTA AACACTATTG ATGGTGGCAT TGTGAAATCT 1440 

GTATGTGTGT CTCAGTGTGG TAATTTTGGT TTAGTGGGAT CATCACTGGG TGGTATTGGA 1500 

TCATACAACC TTCAAAGTGG ATTGTTGCGT AAAAAATATG TTTTACATAA ACAAGCTGTC 1560 

ACCGGTTTAG CAATTGATGG AATGAATAGA AAAATGGTTA GTTGTGGTTT AGATGGAATT 1620 

GTGGGATTCT ATGATTTTGG AAAGTCTGTC TATTTAGGCA AATTACAACT TGAAGCACCT 1680 

ATAACATCCA TGATATATCA CAAACTGTCT GATCTTGTTG CTTGTGCCTT GGATGATTTG 1740

TCCATAGTTG TTATTGACGT GACTACTCAA AAAGTCATAA GAATATTATA TGGTCATACC 1800 

AACAGAATTT CAGGAATGGA TTTCTCGCCT GATGGGAGAT GGATAGTTTC AGTTGCATTG 1860 

GACTCCACTT TGCGAACTTG GGACTTGCCA ACTGGTGGTT GTATTGATGG GGTGATTTTA 1920 

CCAATTGTGG CAACTGCAGT TAAATTTTCT CCTATTGGTG ATATCTTAGC GACAACACAT 1980 

GTCTCTGGAA ATGGTGTATC CTTATGGACT AATCGTGCCC AGTTCAAGCC TGTGTCCACC 2040 

AGACACGTAG AAGAAGATGA GTTTTCAACT ATTTTATTAC CAAATGCTTC TGGAGATGGC 2100 

GGTTCAACAA TGCTAGACGG GTTTTTGGAC GAGGATTCTA ATGAAGACGG CACTATTGAT 2160 

GAACAGTATA CATCTGCTGC TCAAATTGAT GCATCCTTGA TTACTTTATC ATCAGAGCCA 2220 



AGATCAAAAT TCAACACTTT ATTGCATTTG GATACCATTA AACAACAAAG CAAACCGAAA 2280 

GAAGCACCTA AAAAACCAGA AAATGCACCT TTCTTTTTAC AATTGACTGG ACAAGCAGTT 2340 

GGTGATAGGG CATCGGTTGC TGAAGGCAAA ACTTCAGAAC AAACAAATAA CACTGTTGAA 2400 

GAAACCAACA GCAAATTGCG TAAATTGGAT ACAAACGGTA ACCACGCATT TGAAAGTGAA 2460 

TTCACAAAAC TATTAAGGGA AGCTGGAGAG AGTGGACAAT TTGAAAGATT TTTGACTTAC 2520 

TTACTTAACT TATCTCCTGC TGTATTGGAC TTGGAAATTA GATCACTTAA TTCATTTGTT 2580 

CCATTGACTG AAATGACAAA TTTTATTCAA GCTTTAAATG CTGGTTTGAA ATCAAACGCA 2640 

AATTATGAAA TATGGGAAAC TTTATATGCC ATGTTTTTCA ACATACATGG TGATGTTATC 2700 

CATCAGTTTG AAAATGAAAC TAGTCTTCAT GAAGCTTTGG AAGAATACAG ACAGTTAAAT 2760 

GATGAAAAGA ATAACAAAAT GGATTCTTTA GTGAAATATT GTGCTAGTAT CGTAAGTTTT 2820 

ATTAGTTAG 2829 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 942 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Thr Glu Thr Val Ile Glu Lys Lys Arg Lys Val Asp Leu Asn Ala 
1 5 10 15 

Ser Gly Ile Thr Lys Gln Pro Lys Ala Ser Lys Ile Phe Ser Pro Phe 
20 25 30 

Arg Val Leu Gly Asn Val Thr Asp Ser Thr Pro Phe Ala Met Gly Thr 
35 40 45 

Leu Gly Ser Thr Phe Tyr Ala Val Thr Ser Val Gly Arg Ser Phe Gln 
50 55 60 

Ile Tyr Asp Leu Ala Thr Leu His Leu Leu Phe Val Ser Gln Thr Gln 
65 70 75 80 

Thr Pro Ser Arg Ile Thr Ser Leu Ala Ala His His His Tyr Val Tyr 
85 90 95 

Ala Ser Tyr Gly Asp Arg Ile Gly Ile Phe Arg Arg Gly Arg Leu Glu 
100 105 110 

His Glu Leu Val Cys Glu Gly Asn Ser Thr Val Asn Gln Leu Leu Val 
115 120 125 

Phe Gly Glu Tyr Leu Ile Ala Thr Thr Leu Glu Gly Asp Ile Phe Val 
130 135 140 

Phe Arg Lys Thr Glu Gly Lys Lys Phe Pro Thr Glu Leu Tyr Thr Thr 
145 150 155 160 

Ile Arg Ile Ile Asn Ser Leu Val Glu Gly Glu Ile Val Gly Leu Ile 
165 170 175 

His Pro Pro Thr Tyr Leu Asn Lys Val Ile Val Ala Thr Thr Gln Ser 
180 185 190 

Val Phe Val Ile Asn Val Arg Thr Gly Lys Leu Leu Tyr Lys Ser Arg 
195 200 205 

Glu Leu Gln Phe Glu Gly Glu Lys Ile Ser Ser Ile Glu Ala Ala Pro 
210 215 220 

Val Leu Asp Val Ile Ala Val Gly Thr Ser Asn Gly Asn Val Phe Leu 
225 230 235 240 

Phe Asn Ile Lys Lys Gly Lys Val Leu Gly Gln Lys Ile Ile Thr Ser 
245 250 255 

Gly Thr Glu Ser Ser Ser Lys Val Ala Ser Ile Ser Phe Arg Thr Asp 
260 265 270 

Gly Ala Pro His Leu Val Ala Gly Leu Asn Asn Gly Asp Leu Tyr Phe 
275 280 285 

Tyr Asp Leu Asp Lys Lys Ser Arg Val His Val Leu Arg Asn Ala His 
290 295 300 

Lys Glu Thr His Gly Gly Val Ala Asn Ala Lys Phe Leu Asn Gly Gln 
305 310 315 320 

Pro Ile Val Leu Ser Asn Gly Gly Asp Asn His Leu Lys Glu Phe Val 
325 330 335 

Phe Asp Pro Asn Leu Thr Thr Ser Asn Ser Ser Ile Val Pro Pro Pro 
340 345 350 

Arg His Leu Arg Ser Arg Gly Gly His Ser Ala Pro Pro Val Ala Ile 
355 360 365 

Glu Phe Pro Gln Glu Asp Lys Thr His Phe Leu Leu Ser Ala Ser Arg 
370 375 380 

Asp Lys Thr Phe Trp Thr Phe Ser Leu Arg Lys Asp Ala Gln Ala Gln 
385 390 395 400 

Glu Met Ser Gln Arg Leu Gln Lys Ser Lys Asp Gly Lys Arg Gln Ala 
405 410 415 

Gly Gln Val Val Ser Met Arg Glu Lys Phe Pro Glu Ile Ile Ser Ile 
420 425 430 

Ser Ser Ser Tyr Ala Arg Glu Gly Asp Trp Glu Asn Ile Ile Thr Ala 
435 440 445 

His Lys Asp Glu Thr Phe Ala Arg Thr Trp Asp Ser Arg Asn Lys Arg 
450 455 460 

Val Gly Arg His Leu Leu Asn Thr Ile Asp Gly Gly Ile Val Lys Ser 
465 470 475 480 

Val Cys Val Ser Gln Cys Gly Asn Phe Gly Leu Val Gly Ser Ser Ser 
485 490 495 

Gly Gly Ile Gly Ser Tyr Asn Leu Gln Ser Gly Leu Leu Arg Lys Lys 
500 505 510 

Tyr Val Leu His Lys Gln Ala Val Thr Gly Leu Ala Ile Asp Gly Met 
515 520 525 

Asn Arg Lys Met Val Ser Cys Gly Leu Asp Gly Ile Val Gly Phe Tyr 
530 535 540 

Asp Phe Gly Lys Ser Val Tyr Leu Gly Lys Leu Gln Leu Glu Ala Pro 
545 550 555 560 

Ile Thr Ser Met Ile Tyr His Lys Ser Ser Asp Leu Val Ala Cys Ala 
565 570 575 

Leu Asp Asp Leu Ser Ile Val Val Ile Asp Val Thr Thr Gln Lys Val 
580 585 590 

Ile Arg Ile Leu Tyr Gly His Thr Asn Arg Ile Ser Gly Met Asp Phe 
595 600 605 

Ser Pro Asp Gly Arg Trp Ile Val Ser Val Ala Leu Asp Ser Thr Leu 
610 615 620 

Arg Thr Trp Asp Leu Pro Thr Gly Gly Cys Ile Asp Gly Val Ile Leu 
625 630 635 640 

Pro Ile Val Ala Thr Ala Val Lys Phe Ser Pro Ile Gly Asp Ile Leu 
645 650 655 

Ala Thr Thr His Val Ser Gly Asn Gly Val Ser Leu Trp Thr Asn Arg 
660 665 670 

Ala Gln Phe Lys Pro Val Ser Thr Arg His Val Glu Glu Asp Glu Phe 
675 680 685 

Ser Thr Ile Leu Leu Pro Asn Ala Ser Gly Asp Gly Gly Ser Thr Met 
690 695 700 

Leu Asp Gly Phe Leu Asp Glu Asp Ser Asn Glu Asp Gly Thr Ile Asp 
705 710 715 720 

Glu Gln Tyr Thr Ser Ala Ala Gln Ile Asp Ala Ser Leu Ile Thr Leu 
725 730 735 

Ser Ser Glu Pro Arg Ser Lys Phe Asn Thr Leu Leu His Leu Asp Thr 
740 745 750 

Ile Lys Gln Gln Ser Lys Pro Lys Glu Ala Pro Lys Lys Pro Glu Asn 
755 760 765 

Ala Pro Phe Phe Leu Gln Leu Thr Gly Gln Ala Val Gly Asp Arg Ala 
770 775 780 

Ser Val Ala Glu Gly Lys Thr Ser Glu Gln Thr Asn Asn Thr Val Glu 
785 790 795 800 

Glu Thr Asn Ser Lys Leu Arg Lys Leu Asp Thr Asn Gly Asn His Ala 
805 810 815 

Phe Glu Ser Glu Phe Thr Lys Leu Leu Arg Glu Ala Gly Glu Ser Gly 
820 825 830 

Gln Phe Glu Arg Phe Leu Thr Tyr Leu Leu Asn Leu Ser Pro Ala Val 
835 840 845 

Leu Asp Leu Glu Ile Arg Ser Leu Asn Ser Phe Val Pro Leu Thr Glu 
850 855 860 

Met Thr Asn Phe Ile Gln Ala Leu Asn Ala Gly Leu Lys Ser Asn Ala 
865 870 875 880 

Asn Tyr Glu Ile Trp Glu Thr Leu Tyr Ala Met Phe Phe Asn Ile His 
885 890 895 

Gly Asp Val Ile His Gln Phe Glu Asn Glu Thr Ser Leu His Glu Ala 
900 905 910 

Leu Glu Glu Tyr Arg Gln Leu Asn Asp Glu Lys Asn Asn Lys Met Asp 
915 920 925 

Ser Leu Val Lys Tyr Cys Ala Ser Ile Val Ser Phe Ile Ser 
930 935 940 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 725 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
AACCTGGCAA TTAACTGCCC GGCAAGTGAT AGCAGGAGAT AGGTGTGTAT AGATTATAAT 60 
GGAACGCCGA TTTTTGCAGT ATCACGCGTA ATAAGGACAG CAGTTGGACA TCGGTACATG 120 
AGAGAGCAAT GTAAGTCTTG ATAGTAATGA GCCGTGTTGA AGTAGTATTT TAATCTAATT 180 
TTACTCAAAA AAGGACAATG GAGATCTGGA GATAACAGCA CACTAATCGG TTCTAGACAT 240 
AGACTAAGCC TGAAAGGGGG TACTACAGCT TGTTTTGAAA AGGTTTGCGT TGTATAGGCA 300 
GTTAAATGTG TGTTTTTTTT GGGTAGAATT TGAGAAAAAG TTGACTGAAA AAAATGCAAG 360 
AAACGGGGTG ATCATGAAAA TAGACACACA CT^AAAAGTCA AAAAACAATG GAAAAGCTTC 420 
AGAATAAGCA GTAGGAGGTG TCTGAATTGA GTTTGTATTG TTATTTAGAG TTTTAAATTA 480 
GAGTTGTAAA TTTTTGGGTA GAATTTACGA AAAAGTCGAA CAAAAAAACG ACAAGTCAGG 540 
GTGATTGCAA AAAAACAGAA ACAATAGATA ATCTTAAATT AAGGTAGTAG AGGCTCTGTG 600 
AAGTAATTTA GAGTTTAAAC AGGGGGGCAC GAGTCAGTGT TAGAGTTGTG AAGTTTATTT 660 
GGCTAGTGAA TTGACTGGCA AGATTGTTAA ACGTGGGGTA GAAAAAGACA ACGCATCGAC 720 
AGGTT 725 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1144 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CCATGATATA GAAATTGGTG GGTCAACGTA CTATCAAATT AACATAAAAC TACCACTTCG 60 

GTCATTCACG ATAAAGAAAC GGTACCTGGA ATTCCAGCAA TTGGTGCTGG ACTTGAGTCG 120 

TAATCTAGGC ATTGATAGTC GAGATTTTCC ATATGAATTA CCTGGGAAAC GGATCAACTG 180 

GCTTAACAAG ACCAGTATTG TTGAGGAGAG AAAAGTGGGA CTTGCAGAAT TTCTCAATAA 240 

CCTCATTCAA GACTCAACAC TTCAGAATGA ACGAGAAGTG TTGTCGTTTT TGCAATTGCC 300 

GTCTAATTTT AGATTCACCA AGGATATGTT ACAGAATAAT CGAGCAGACT TGGATTCTGT 360 

GCAAAATAAC TGGTACGATG TATATCGTAA GTTGAAACTG GATATACTCA ACGAATCGTC 420 

TAGCAGCATT AGTGAACAGA TACATATTCG TGATCGCATT AGTCGGGTCT ACCAACCACG 480 

GATTCTCGAC TTGGTCAGGG CTATTGGTAC AGATAAAGAA GAGGCCCTAA AGAAGAAGCA 540 

GTTGGTTTCC CAATTACAAG AGAGTATAGA TAATTTGTTA GTACAGGAAG TTCCCCGATC 600 

AAAGAGGGTG TTGGGTGGAG CAGTTAAGGA AACGCCAGAG ACATTACCAT TAAACAATAA 660 

AGAACTTCTT CAACACCAAG TACAAATTCA TCAAAACCAA GACAAAGAAC TAGACCAGCT 720 

TAGGGTGTTA ATTGCCCGGC AGA7ACAGAT TGGCGAGCTA ATTAATGCAG AAGTAGAGGA 780 

ACAGAATGAA ATGTTGGATA GGTTTAATGA AGAGGTCGAC TACACGTCCA GCAAAATCAA 840 

GCAAGCAAGA CGCAGAGCTA AGAAGATATT ATAGTAATTT GTTCGCTACT TCGATATTAT 900 

CTGCCATTGA CGTTATTCTT GCAGGTTGGC CCAATTGTTC GTTTGAAAGT TTTTCGAGGT 960 

CTTCAGCGTC TAATGCCCTA TCTGAGCTCT CGCCATCGAG TTTCCAAAAC CCGCCGATAT 1020 

TTTGAAAGAA TCTTTGAATG CCAAACCGTC GTGGCGGGAA CGATCTGCCT GCGTTGGCCA 1080 

AGTTGAATAT GCTAGGGTGG TACTGTAAAT AGAAGACAGA TCCAATAAAC GTTCCTATAA 1140 

ATGC 1144 


(2) INFORMATION FOR SEQ ID NO: 17: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 290 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

His Asp Ile Glu Ile Gly Gly Ser Thr Tyr Tyr Gln Ile Asn Ile Lys 
1 5 10 15 

Leu Pro Leu Arg Ser Phe Thr Ile Lys Lys Arg Tyr Ser Glu Phe Gln 
20 25 30 

Gln Leu Val Ser Asp Leu Ser Arg Asn Leu Gly Ile Asp Ser Arg Asp 
35 40 45 

Phe Pro Tyr Glu Leu Pro Gly Lys Arg Ile Asn Trp Leu Asn Lys Thr 
50 55 60 

Ser Ile Val Glu Glu Arg Lys Val Gly Leu Ala Glu Phe Leu Asn Asn 
65 70 75 80 

Leu Ile Gln Asp Ser Thr Leu Gln Asn Glu Arg Glu Val Leu Ser Phe 
85 90 95 

Leu Gln Leu Pro Ser Asn Phe Arg Phe Thr Lys Asp Met Leu Gln Asn 
100 105 110 

Asn Arg Ala Asp Leu Asp Ser Val Gln Asn Asn Trp Tyr Asp Val Tyr 
115 120 125 

Arg Lys Leu Lys Ser Asp Ile Leu Asn Glu Ser Ser Ser Ser Ile Ser 
130 135 140 

Glu Gln Ile His Ile Arg Asp Arg Ile Ser Arg Val Tyr Gln Pro Arg 
145 150 155 160 

Ile Leu Asp Leu Val Arg Ala Ile Gly Thr Asp Lys Glu Glu Ala Leu 
165 170 175 

Lys Lys Lys Gln Leu Val Ser Gln Leu Gln Glu Ser Ile Asp Asn Leu 
180 185 190 

Leu Val Gln Glu Val Pro Arg Ser Lys Arg Val Leu Gly Gly Ala Val 
195 200 205 

Lys Glu Thr Pro Glu Thr Leu Pro Leu Asn Asn Lys Glu Leu Leu Gln 
210 215 220 

His Gln Val Gln Ile His Gln Asn Gln Asp Lys Glu Leu Asp Gln Leu 
225 230 235 240 

Arg Val Leu Ile Ala Arg Gln Lys Gln Ile Gly Glu Leu Ile Asn Ala 
245 250 255 

Glu Val Glu Glu Gln Asn Glu Met Leu Asp Arg Phe Asn Glu Glu Val 
260 265 270 

Asp Tyr Thr Ser Ser Lys Ile Lys Gln Ala Arg Arg Arg Ala Lys Lys 
275 280 285 

Ile Leu 
290 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2736 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 11 

(D) OTHER INFORMATION: /note= "N = G or A or T or C" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 2723..2724 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 2714..2715 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 2710 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 2706..2707 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
ATGGAAAAAA NTTTGGCGAG TGTAAAGTTG TACACCGATT TGGAGTGTGT 
AACTATCCAA CAAGAATTGT TTGGGGTGCT TCTTACAATT TTGGAATTCA ACAGATGATG 120 
GCAAACTTTG ATCGGTTTTC AAAACCACCA GTGGATCCAT CTACAAAATT AGGATTTTGG 180 
GATAAGTTAA AGTATATCTT ACATGGTAAA TGCCAAATCA GAACTAGGAA AAGTTTAGAA 240 
GTTGCATTTA AAGGATCAAG AGATCCGTAT GATTTGTTCA CGACTGCAGG CGGGTTTGTA 300 
TTGTCATTTA GAAAGAATGT TGTCTGGGAC ATCAATAAAG ACGATAATTC GAAAAATTAC 360 
TTCGATATCA CGGCAGATAA AGTTTCCTGG TATATTCCAA ACTATTTAGC AGGACCATTA 420 
TTGGCTTGGA CAAGAAGTAG TAAAAATTCA ATTTATTTAC CAAATTCACC AAATGTGGTT 480 
AATTCTTGCT TTGCATATTA CCTTCAAGAT TTTACTGGAC AAGCTGATTT TGATCATGCT 540 
GCCCGAGTAT TTGAAAGAAA TGTGGTCAAT CTTAGTGGAG GAATTCATTT TCAAGTTGGG 600 
TTTCTACTTG AACGTAAAGA TACAAATGGT AAGAGAACCG ATGAATTCAA ACCTCATTAC 660 
GAAGTGCAGT TGTTTGATCC CAAGTATTGT GAGAAAGGAC ATGACTCTTA TGCTGGGTTC 720 
CGAAGTCAAT TTATACATAT GGCTATCTCA TTGGAATCAA CAAACAGTTC AAGTTATAAT 780 
ACAATCCATC TTAGTCCTGG TACTTTCCAA CAGTTTTTCG ATTGGTGGAA GTTATTTGCT 840 
AGTAATATGC AGTTACCTAT TAGACGTGGC AAAATGTTTG GAGAAGCAAA AGAATCTGTC 900 
AAGTTTTCGC AACATTTATT CACAAACAAG TTTTCTTTCA TGTTGAAATC TTTGTTTATT 960 
GCTCATGTTT ATCGAGACGA AATTGTTGAT ATCAATAACG ATAGAATAGA AAGTATTGGT 1020 
TTAAGAGCCA AAGTAGATGA TTTTATGGTT GATTTACATC AAAGAAAAGA GCCAGCAACC 1080 
CTTTACCATG AAGAATTATC TAAGAATGAG AAGGTGATGA AAATGAATTT TGATTTAGGA 1140 
GAAGTCGTTT TATCAGGAAT AGACTTACGT GTCATGCATG TTTCATTTCT CCAAAATTTA 1200 
TACACTCAAT CACATTCCAA TTCAGGTGAC GCTAAATCAA CTTATAATAT TTACGACAAT 1260 
GATCATCGAT GGTTTGATAT TATGGATTTC CAAGAGGCAT TTTTGACATC AATTAAGGAT 1320 
TGTGTCAGGA CAGTTGATAT TTATCCATTG ATGTATTTAC AAAGATTCTT TTATGAAAGA 1380 
GATACACATG GTGGCAAGTC TGAGGATGAG ACTGCATTTG GAAAAGAAGT TATTCATAAA 1440 
TGTAATTTGG GTGCCATGAA TCCCTTGGAA ACAAGATTGA ATGTATTGGT TCAAAGACTT 1500 
AACGCTCTAC AAGAACAAGT CAAAAAATTG TCCAAAACAT CTGCTCCAGA ACCTGTAGCA 1560 
GATTTGAAAA AACGAATTCT GTTTTTGCAA AAAGAGATTA GCACAACCAA AGCTGGCGTT 1620 
AAGTCGAAAA TGCGTCGTAC ATCCACTATA AATGGTATGA ATAATTCTGA AAATTACCAC 1680 
AATAAGTTTA CTTTCTATAA CATGCTTCTT AAATGGAATT TCAATTGTCG GAATTTGACA 1740 
TTGAAATACA TACATTTTGT GAAATTGAAA TCACAACTTC GAAATTACTT GTCACACAAG 1800 
TCCATTGAAA CACTTGAAAA AATGATGGAT AGTGTAAATG CATACAACGA TAAGGACGAT 1860 
TTGTCATCGA CGTCAGAAAT AATCCGTCGT TTCACACTGG AAGGGGTTAA ATCACAGACA 1920 
TCTACCAGCA AAGATATCAC TTCACAACAG AAACTTGACA ATTTCAACAC AATATTACGA 1980 
GAGACCAGAC CAGACGAAAA AGTGGTTGAG GATTATTTGA TTGACGTGAT CGCACCTCAA 2040 
ATTCAATTAC AAAGTGAGGA TTATCCTGAT TCTGTTGTGC TCATCTCTAC ACCATCTATT 2100 
AAAGGTAAAA TTTTGTCCAT TAGGGATTCC AGGAATAATG CAAACCAAAT CTTGTTAGAA 2160 
ACTAGGTATG GTATTTTACT AAAAGATGCC AATGTTTTTG TATTAAACAA AGAGGATATT 2220 
GTAGGGTGTC CAGATATGTT AAGTATTAGT AATCCATATG GAGCTAAATC TAATTGGCCA 2280 
CCATGGCTAG GAACAGAAAT AACCCAAAAT GGTAAATGGG CTGGAGCCAA CAACTTATTG 2340 
ATTGAAAAGC TTTCTGTTAT GACAATGTGT TATGAAAGTG AAATTTTGTC AAGCAAGCTT 2400 
TCTCCAAATG CACAAGATCT GGATCAAGAA GAGCAAGAAA ATTACAATGA TGATAATTCG 2460 
AAACAGGCTC CTCTTCGACT TGGTATTGAT ATGCCTTCTG TGGTGATTAC ATCTACATCA 2520 
AGTCAATACT TTACCTTATA TGTTATCATA GTGAGCTTGT TGTTTTATAG CGAGCCTATG 2580 
AGTAAAGTGA TCCACAAGAA AATCGAAAAG ATGAAGTTTT CTATTGATTT CGAAGATTTG 2640 
GGTGCTCTTA CTAGCAGATT AACGAAAATG CAGCAACATC ATAAATTGTT GAAAGTATTG 2700 
TCTAANNACN AATNNTTTCC CGNNCGGGGG AATTAA 2736 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 911 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Glu Lys Xaa Leu Ala Ser Val Lys Leu Tyr Thr Asp Leu Glu Cys 
1 5 10 15 

Val Phe Asn Ser Asn Tyr Pro Thr Arg Ile Val Trp Gly Ala Ser Tyr 
20 25 30 

Asn Phe Gly Ile Gln Gln Met Met Ala Asn Phe Asp Arg Phe Ser Lys 
35 40 45 

Pro Pro Val Asp Pro Ser Thr Lys Leu Gly Phe Trp Asp Lys Leu Lys 
50 55 60 

Tyr Ile Leu His Gly Lys Cys Gln Ile Arg Thr Arg Lys Ser Leu Glu 
65 70 75 80 

Val Ala Phe Lys Gly Ser Arg Asp Pro Tyr Asp Leu Phe Thr Thr Ala 
85 90 95 

Gly Gly Phe Val Leu Ser Phe Arg Lys Asn Val Val Trp Asp Ile Asn 
100 105 110 

Lys Asp Asp Asn Ser Lys Asn Tyr Phe Asp Ile Thr Ala Asp Lys Val 
115 120 125 

Ser Trp Tyr Ile Pro Asn Tyr Leu Ala Gly Pro Leu Leu Ala Trp Thr 
130 135 140 

Arg Ser Ser Lys Asn Ser Ile Tyr Leu Pro Asn Ser Pro Asn Val Val 
145 150 155 160 

Asn Ser Cys Phe Ala Tyr Tyr Leu Gln Asp Phe Thr Gly Gln Ala Asp 
165 170 175 

Phe Asp His Ala Ala Arg Val Phe Glu Arg Asn Val Val Asn Leu Ser 
180 185 190 

Gly Gly Ile His Phe Gln Val Gly Phe Leu Leu Glu Arg Lys Asp Thr 
195 200 205 

Asn Gly Lys Arg Thr Asp Glu Phe Lys Pro His Tyr Glu Val Gln Leu 
210 215 220 

Phe Asp Pro Lys Tyr Cys Glu Lys Gly His Asp Ser Tyr Ala Gly Phe 
225 230 235 240 

Arg Ser Gln Phe Ile His Met Ala Ile Ser Leu Glu Ser Thr Asn Ser 
245 250 255 

Ser Ser Tyr Asn Thr Ile His Leu Ser Pro Gly Thr Phe Gln Gln Phe 
260 265 270 

Phe Asp Trp Trp Lys Leu Phe Ala Ser Asn Met Gln Leu Pro Ile Arg 
275 280 285 

Arg Gly Lys Met Phe Gly Glu Ala Lys Glu Ser Val Lys Phe Ser Gln 
290 295 300 

His Leu Phe Thr Asn Lys Phe Ser Phe Met Leu Lys Ser Leu Phe Ile 
305 310 315 320 

Ala His Val Tyr Arg Asp Glu Ile Val Asp Ile Asn Asn Asp Arg Ile 
325 330 335 

Glu Ser Ile Gly Leu Arg Ala Lys Val Asp Asp Phe Met Val Asp Leu 
340 345 350 

His Gln Arg Lys Glu Pro Ala Thr Leu Tyr His Glu Glu Leu Ser Lys 
355 360 365 

Asn Glu Lys Val Met Lys Met Asn Phe Asp Leu Gly Glu Val Val Leu 
370 375 380 

Ser Gly Ile Asp Leu Arg Val Met His Val Ser Phe Leu Gln Asn Leu 
385 390 395 400 

Tyr Thr Gln Ser His Ser Asn Ser Gly Asp Ala Lys Ser Thr Tyr Asn 
405 410 415 

Ile Tyr Asp Asn Asp His Arg Trp Phe Asp Ile Met Asp Phe Gln Glu 
420 425 430 

Ala Phe Leu Thr Ser Ile Lys Asp Cys Val Arg Thr Val Asp Ile Tyr 
435 440 445 

Pro Leu Met Tyr Leu Gln Arg Phe Phe Tyr Glu Arg Asp Thr His Gly 
450 455 460 

Gly Lys Ser Glu Asp Glu Thr Ala Phe Gly Lys Glu Val Ile His Lys 
465 470 475 480 

Cys Asn Leu Gly Ala Met Asn Pro Leu Glu Thr Arg Leu Asn Val Leu 
485 490 495 

Val Gln Arg Leu Asn Ala Leu Gln Glu Gln Val Lys Lys Leu Ser Lys 
500 505 510 

Thr Ser Ala Pro Glu Pro Val Ala Asp Leu Lys Lys Arg Ile Ser Phe 
515 520 525 

Leu Gln Lys Glu Ile Ser Thr Thr Lys Ala Gly Val Lys Ser Lys Met 
530 535 540 

Arg Arg Thr Ser Thr Ile Asn Gly Met Asn Asn Ser Glu Asn Tyr His 
545 550 555 560 

Asn Lys Phe Thr Phe Tyr Asn Met Leu Leu Lys Trp Asn Phe Asn Cys 
565 570 575 

Arg Asn Leu Thr Leu Lys Tyr Ile His Phe Val Lys Leu Lys Ser Gln 
580 585 590 

Leu Arg Asn Tyr Leu Ser His Lys Ser Ile Glu Thr Leu Glu Lys Met 
595 600 605 

Met Asp Ser Val Asn Ala Tyr Asn Asp Lys Asp Asp Leu Ser Ser Thr 
610 615 620 

Ser Glu Ile Ile Arg Arg Phe Thr Ser Glu Gly Val Lys Ser Gln Thr 
625 630 635 640 

Ser Thr Ser Lys Asp Ile Thr Ser Gln Gln Lys Leu Asp Asn Phe Asn 
645 650 655 

Thr Ile Leu Arg Glu Thr Arg Pro Asp Glu Lys Val Val Glu Asp Tyr 
660 665 670 

Leu Ile Asp Val Ile Ala Pro Gln Ile Gln Leu Gln Ser Glu Asp Tyr 
675 680 685 

Pro Asp Ser Val Val Leu Ile Ser Thr Pro Ser Ile Lys Gly Lys Ile 
690 695 700 

Leu Ser Ile Arg Asp Ser Arg Asn Asn Ala Asn Gln Ile Leu Leu Glu 
705 710 715 720 

Thr Arg Tyr Gly Ile Leu Leu Lys Asp Ala Asn Val Phe Val Leu Asn 
725 730 735 

Lys Glu Asp Ile Val Gly Cys Pro Asp Met Leu Ser Ile Ser Asn Pro 
740 745 750 

Tyr Gly Ala Lys Ser Asn Trp Pro Pro Trp Leu Gly Thr Glu Ile Thr 
755 760 765 

Gln Asn Gly Lys Trp Ala Gly Ala Asn Asn Leu Leu Ile Glu Lys Leu 
770 775 780 

Ser Val Met Thr Met Cys Tyr Glu Ser Glu Ile Leu Ser Ser Lys Leu 
785 790 795 800 

Ser Pro Asn Ala Gln Asp Ser Asp Gln Glu Glu Gln Glu Asn Tyr Asn 
805 810 815 

Asp Asp Asn Ser Lys Gln Ala Pro Leu Arg Leu Gly Ile Asp Met Pro 
820 825 830 

Ser Val Val Ile Thr Ser Thr Ser Ser Gln Tyr Phe Thr Leu Tyr Val 
835 840 845 

Ile Ile Val Ser Leu Leu Phe Tyr Ser Glu Pro Met Ser Lys Val Ile 
850 855 860 

His Lys Lys Ile Glu Lys Met Lys Phe Ser Ile Asp Phe Glu Asp Leu 
865 870 875 880 

Gly Ala Leu Thr Ser Arg Leu Thr Lys Met Gln Gln His His Lys Leu 
885 890 895 

Leu Lys Val Leu Ser Xaa Xaa Xaa Xaa Phe Pro Xaa Arg Gly Asn 
900 905 910 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 






(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
ATTCTTTGTT TGTTTGTTGA TTTTTGATCT CTTGTCTAGA ATCACTCATT AATATTTGAT 60 
TCAGGGTTTT GATTTGCTAA ATAAGGGGTC TATTAGGAGG ATATTATATA TAATGTGATG 120 
TGGCGAAAAA AAAAAACAAG ATCTACTACT CTGTTGGATT TATTTGTGAT GGCGATTGAA 180 
GAGAAAACAC GTCTTTTTAA CGCGTTTTTT TATTTTTTGG AGAAGCAAAT TTCAAGCAAA 240 
GACTCTTATT GTGTTGCTTT TGATCCATTC AAATTTTGTA TTACTTTTCA TTAGAACTAT 300 
AACTGTTCAT TATCAATGAC GTATACATGT CTGGTTCCTG TTATGTATTG TAATTTTAGT 360 
TAATTATAAG CCGTATATTG GTAGTATTCC TCTGTACTCA CAATGGAATT GGTCTTTCAA 420 
CAGCAACAAG TGTTATTTTC CCTGAATGTA GAAAATGAAA GGTAGTGTTT ACATATAGTT 480 
GGAAATCAAG CCTCTGAAAT GAATCACAAT ATAATAACAA TTTGTAGTTG CAGAGAAAAA 540 
CAATTCAAGT TGACGGGTAG xTTTTTTTTT TTCACTGCAT TTTTCAACGA AAACTAAATA 600 
AAATTTCGCT GATATTGATA AAGTAT 626 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
ATGGCGTCAA TTTCTGTTCC TATTGAAAAA GGATCATTTC ACGATGGAGA TGGATTCAAT 60 
CAACATCATT TAGGAGACCC AGTTATTTCA GGACCTCCCT ATATTATTAA ATTATTAAAC 120 
TTACCCGTCA CAGCTAATGA TTCATTTGTC CAAGACTTGT TTCAAAGCAG ATTTACCCCA 180 
TATGTCAAAT TTAAAATTGT AACAGACCCC GCATCAAATA TTTTGGAGAC TCATGTCATT 240 
AGACAAGTGG CTTTTGTGGA ATTGGAATCG GCCAGTGATA TGTCAAAAGC TTTAAAATGG 300 
CATGATTTGT ATTATAAGAC AAATAGAAGA GTAACTGTTG AAGTGGCAGA TTTTAATGAT 360 
TTTCAAAATT GTATTAAATT CAATCAAGAA CATGAACGTG AAATTATGCA AATCCAACAA 420 
GAATTCATTG CTCAGAAACA ACAACAACGG CAACCCAGAC ATATGGCTCT TTTAGATGAA 480 
TTTGAAAGAA ACCAGCGCGG TCCTGGATCA CCCTTGCATC AAAACCATGA TCACCACAAT 540 
CCCCACCCAC AACAACAACA ACACCATCAT TTCAATCCTA ATTTAAACAG ACCTTCAGGT 600 
AGATCAAGTC TTCCAATAGA TGAAACGTCT CATTCAAGAA GACTTTCTTT TG 652 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 217 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Ala Ser Ile Ser Val Pro Ile Glu Lys Gly Ser Phe His Asp Gly 
1 5 10 15 

Asp Gly Phe Asn Gln His His Leu Gly Asp Pro Val Ile Ser Gly Pro 
20 25 30 

Pro Tyr Ile Ile Lys Leu Leu Asn Leu Pro Val Thr Ala Asn Asp Ser 
35 40 45 

Phe Val Gln Asp Leu Phe Gln Ser Arg Phe Thr Pro Tyr Val Lys Phe 
50 55 60 

Lys Ile Val Thr Asp Pro Ala Ser Asn Ile Leu Glu Thr His Val Ile 
65 70 75 80 

Arg Gln Val Ala Phe Val Glu Leu Glu Ser Ala Ser Asp Met Ser Lys 
85 90 95 

Ala Leu Lys Trp His Asp Leu Tyr Tyr Lys Thr Asn Arg Arg Val Thr 
100 105 110 

Val Glu Val Ala Asp Phe Asn Asp Phe Gln Asn Cys Ile Lys Phe Asn 
115 120 125 

Gln Glu His Glu Arg Glu Ile Met Gln Ile Gln Gln Glu Phe Ile Ala 
130 135 140 

Gln Lys Gln Gln Gln Arg Gln Pro Arg His Met Ala Leu Leu Asp Glu 
145 150 155 160 

Phe Glu Arg Asn Gln Arg Gly Pro Gly Ser Pro Leu His Gln Asn His 
165 170 175 

Asp His His Asn Pro His Pro Gln Gln Gln Gln His His His Phe Asn 
180 185 190 

Pro Asn Leu Asn Arg Pro Ser Gly Arg Ser Ser Leu Pro Ile Asp Glu 
195 200 205 

Thr Ser His Ser Arg Arg Leu Ser Phe 
210 215 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1513 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1492 

(D) OTHER INFORMATION: /note= "N = A or G or C or T" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GTAGTTTGTG AAGAAATTGA AACAATCGGA AAACAACAAT ATCAAACTGA TGCCCAATAA 60 
CACTGTATGT ACCTAGATGG ATTACCAAGA TCTACTACAT AAAATAATAA AGGAGTTCCA 120 
CTCACTCAAA GAGTTCAAAC CATGGGATAG CAGTGTTTTG TATGAGACGT TACTACGATC 180 
AGTATTAACT ACTTTGATCG AACTTTTGGG CATAGACAAT CCACCCAGTT ATCTACACCT 240 
CACCACCAAC AATGATAGTA TAGGTGATTT GAAAATAAAA TACTATGGAA ATGCATTAAG 300 
CAAGTCAATC AACGGTCATA GCATGTTGCA ATATCTTGAA TCAAAGCATG TATCGATATT 360 
ACAGGCCGTG GTTGAGATTA TTAATACGCG ATCATATAGA ATCAAAGAGT CTTATTCTGC 420 
TGTTTTCAAA GACGTTTCTC ATTTATTTGA AAAACTACTA AAGGAAAGAT ATGAAGCTGA 480 
ATCTAATCTA GAGGATTATA TATTGCAGTG CTTGATGTAC GAGACCCAAT TTTACCAAGG 540 
AATTGTTGAT AATGTTTTAA CTGCCGATGA CACCGAAAAA TTGGCTAGTT TTTTGGGGAC 600 
ACGACTATCT GAAGAAGATT CGATGTTTAG CTATAGGGAT ATAGATTATC CACTAGAGTT 660 
AAACATTAAT AATGAATCTC TTGAAAAGAT ATATAAAATT TTCTTAGGAG TCATTGGCAC 720 
CAAAAGATTC GATATCAAGG AGGTTGCGTC TGCTGTTGTT GGTGTGTATA AACGACACCA 780 
GAGAATAGAT CATTTTGAAA AGTTGGATTC AGATGAGATT TTGGGAAAGT TTTTCAGAAA 840 
TATATTGCCA CAACTGTTCC AGAGTGTGAC AAATAAGGTT TTCCGGGAAT TTCACAAAGA 900 
GGTAGATGAC CCACCATCGG ACGTGCTAGA CCAGCTAGAT AATATTGTTG ATGACTTTAT 960 
TGCGGTTGGA ATTGAAGGGG TAGATTTGGG CTTTCCGGCT TTGTTCAGAC ACTACATAAA 1020 
ATTCATGAAC GAAATTTTTC CCACTGTGGT CGAGGATGCT GACCGCGATT TTGTTGCAAG 1080 
AATTAATAGT TTAATTGCTC AAGTCTTGGA GTTTAAAGAC GATGAAAAAT CCTGTGATAT 1140 
CAATCAAGTG GTATCTGAAT TTGTTTCATT ACAAAGTTTG CTACTTAAGA ATAACTATCT 1200 
TTCACCATCT ACATTATTGA TGCGTGCAAG TACTCACGAT TACTATAAAA ATTTACAGAT 1260 
CGTGAAAATA ACCTTTGATG GATGGAATGA GAATTCAAAG AGGATATTGA AATTGGAGAA 1320 
CAGCGGCTTT TTACAAAGCA AGACATTGCC AAAGTATTTA AAATTATGGT ACTCAAAAAG 1380 
TATGAAGTTG AATGAATTAT GTAACCGGGT AGATGAATTT TATAATGGAG AACTTTGTCG 1440 
GAAAGTTTTG GGCATTGTTG GGAGGGTCAC AACCAAAATG TCTATAAATC CNCAAAAATG 1500 
GGAGGGTTGC TGA 1513 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 478 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Asp Tyr Gln Asp Leu Leu His Lys Ile Ile Lys Glu Phe His Ser 
1 5 10 15 

Leu Lys Glu Phe Lys Pro Trp Asp Ser Ser Val Leu Tyr Glu Thr Leu 
20 25 30 

Leu Arg Ser Val Leu Thr Thr Leu Ile Glu Leu Leu Gly Ile Asp Asn 
35 40 45 

Pro Pro Ser Tyr Leu His Leu Thr Thr Asn Asn Asp Ser Ile Gly Asp 
50 55 60 

Leu Lys Ile Lys Tyr Tyr Gly Asn Ala Leu Ser Lys Ser Ile Asn Gly 
65 70 75 80 

His Ser Met Leu Gln Tyr Leu Glu Ser Lys His Val Ser Ile Leu Gln 
85 90 95 

Ala Val Val Glu Ile Ile Asn Thr Arg Ser Tyr Arg Ile Lys Glu Ser 
100 105 110 

Tyr Ser Ala Val Phe Lys Asp Val Ser His Leu Phe Glu Lys Leu Leu 
115 120 125 

Lys Glu Arg Tyr Glu Ala Glu Ser Asn Leu Glu Asp Tyr Ile Leu Gln 
130 135 140 

Cys Leu Met Tyr Glu Thr Gln Phe Tyr Gln Gly Ile Val Asp Asn Val 
145 150 155 160 

Leu Thr Ala Asp Asp Thr Glu Lys Leu Ala Ser Phe Leu Gly Thr Arg 
165 170 175 

Leu Ser Glu Glu Asp Ser Met Phe Ser Tyr Arg Asp Ile Asp Tyr Pro 
180 185 190 

Leu Glu Leu Asn Ile Asn Asn Glu Ser Leu Glu Lys Ile Tyr Lys Ile 
195 200 205 

Phe Leu Gly Val Ile Gly Thr Lys Arg Phe Asp Ile Lys Glu Val Ala 
210 215 220 

Ser Ala Val Val Gly Val Tyr Lys Arg His Gln Arg Ile Asp His Phe 
225 230 235 240 

Glu Lys Leu Asp Ser Asp Glu Ile Leu Gly Lys Phe Phe Arg Asn Ile 
245 250 255 

Leu Pro Gln Ser Phe Gln Ser Val Thr Asn Lys Val Phe Arg Glu Phe 
260 265 270 

His Lys Glu Val Asp Asp Pro Pro Ser Asp Val Leu Asp Gln Leu Asp 
275 280 285 

Asn Ile Val Asp Asp Phe Ile Ala Val Gly Ile Glu Gly Val Asp Leu 
290 295 300 

Gly Phe Pro Ala Leu Phe Arg His Tyr Ile Lys Phe Met Asn Glu Ile 
305 310 315 320 

Phe Pro Thr Val Val Glu Asp Ala Asp Arg Asp Phe Val Ala Arg Ile 
325 330 335 

Asn Ser Leu Ile Ala Gln Val Leu Glu Phe Lys Asp Asp Glu Lys Ser 
340 345 350 

Cys Asp Ile Asn Gln Val Val Ser Glu Phe Val Ser Leu Gln Ser Leu 
355 360 365 

Leu Leu Lys Asn Asn Tyr Leu Ser Pro Ser Thr Leu Leu Met Arg Ala 
370 375 380 

Ser Thr His Asp Tyr Tyr Lys Asn Leu Gln Ile Val Lys Ile Thr Phe 
385 390 395 400 

Asp Gly Trp Asn Glu Asn Ser Lys Arg Ile Leu Lys Leu Glu Asn Ser 
405 410 415 

Gly Phe Leu Gln Ser Lys Thr Leu Pro Lys Tyr Leu Lys Leu Trp Tyr 
420 425 430 

Ser Lys Ser Met Lys Leu Asn Glu Leu Cys Asn Arg Val Asp Glu Phe 
435 440 445 

Tyr Asn Gly Glu Leu Cys Arg Lys Val Leu Gly Ile Val Gly Arg Val 
450 455 460 

Thr Thr Lys Met Ser Ile Asn Xaa Gln Lys Trp Glu Gly Cys 
465 470 475 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 436 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

AGTTATGTCT CATACATACA ACACAGATGA GGACATGTGT TTAAATGATA AATTGAAATA 60 

TTTGTACGAT TTATAATCGC TTTATCGTGA CAATTTCGAA TACTGGTACT TTCTACTCTA 120 

TTTGACAAAA ATTTGCAAAA AATTGGGGAA AAAAATCCTG TTGCATTTTC GAGACCATCA 180 

GTTGCAACCA ATCTGAATAT ATTTTGACAC TTCAATAAAT CTAGTGAAAC TAGTCGTCTA 240 

CTTTTTAATT CTAATCATCT CATAGTATAT CAAGCAAAGA CTTACTATGC GTTTATCAAA 300 

TTTAAGAAAA TGTAGACAGT ACGAAAATAC ACGAGTTTCC CAATCTTTGA ACTTGAAAAG 360 

ATAGTAATAC CGAGATTGGC CAAATCCTAG CCATAGTCCG TTCATACAAA TTCATGAACA 420 

ACATCTACAT AAGTAA 436 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CTTTCTTTCG AATTAGATTC AATCTTTTCC AATTTTGCTT GTACACTTGC TAGTTTGAAT 






TTACGTTTTT CCTCTTTACG TTGTTTCACA ATGGCTGCAC GTTCTTCAAA ATTTATTCCC 
TTCTTCTTGG TTGGTCTTAT ATCGTTCTCA TCTTCAGGCT TCCTCTCCTC TTGTAACCCT 
TCTTTTTCTA ATAGTTTGAA ATAGTTCTTT CTTAATCTAG CCCTATGGGT TAATGCACGT 
TTTATATCTT GAGACTTGGC TTCTCGACGA TCTATAAATT TCTTTTTTGA TTTAAATGAA 
TTTTTATTAT TTGGATGCAT TGTTGTGGAG GTGTATTTGA TAGGTTGATA ACTAGAAATA 
AAAACTATGT GAAAGAACAA AATGCCAATC ACTAAAAAAA ATTTAAGATG AGTATGAAAT 
CAAAACTTTA CGACATCTTT GCGACATGCA CATTATGAGC GACATTTTGA TTCGATACCA 
GAAATAGACA GATTTAGACA GGGTCTATAA CAGAGAAATC AACAATTAAC TGGTATCAAC 
CTTAAGATTA AAAATGGTCT ATGGCGATAT GAACTGTTGT GATGAAAAAC AATATATTTG 
GAAATACTTC TTTTCATTTG ACAATTTTTT ATAAAATTTT GGCAACAATT TTGTACCTAA 
AAATTCTTTT GTCTTCAAAA GTGAAATGTA ATATAGAAAT ACTATTACAA CCAAACA 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 667 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TTTAGTTTTA TATTGATGAT GTTTTTAAGT GCTTGTTTAT CATGGTGGAT GGAAATTAGA 
ATGAGTAAAT TGAATGGAAA ATCACTGCAA CACCAACAAC TVACCACTGGT GGATACGAAA 
ATTTAGTGTA CAAATTTCTG CCAAAAAAAT ACAATAAAAA CCGCTTATAG TCTTCTACTG 
ACATAACAAC ACAAGTCAAT AAATCAACAA CTCATAAACA ATGTAGACTT AATACTATCG 
CTTAATTATT TAAACTATAA TAAATACCCT ATAGTATTAT GCCTTTGTCA ATGTGTGTAG 
AATTTGGTTA TTACATATCC ATGTGTAATA TATATGTTGA TCAAAAAACG CGATCTTCTC 
TTTGGTGTAG TGTGTTACAC AAAAAATTCA CTAGTCTAGG TCACATGATA ATCACGTGAA 
AATCAAAAAT TTGTTGAAAT TGAATTTCCT CAATTTTGAA ATTTTGTTTG AAATTTTTTT 
TTTGCTTTAC AAAAAGACTC CATTTTGTTT TCCATTTCAC AACCAATTAC TTAATTCCTC 
TTTTTCATAA TTAATAACTA TCATTACTTA CAACTACAAA CAACTACGAT CATTTCCTAA 
GAAAAAGCAA CGAGGGCGAA TTGAGACATT AATCCCCTTT ATTTTATCAT CATGCCTTAT 
ACAGAAC 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 165 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 






(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
AACTATTGCC AATGGTAAAT ATGCCAGTGA AATCGAGAAT TTTAATAAGT CGGTCCCTCT 60 
TAAGGTCCCA TTCAAATTCA CTAATGCACA ATTGGATCTT TATGCTGCTA GCACACATAA 120 
CCAAGAGCCA ATATCCTAGT AACGACGCAC CATAGTAGAC CGAAT 165 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 207 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 120 

(D) OTHER INFORMATION: /note= "N = A or C or G or T" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 129 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 162 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 178 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 194 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 195 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 199 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 203 

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
ATGAAGATTT CACCAGAGAC AGTAAATAAA CTACAACTGG ATGCATCGTG TATAAGAAAC 60
ATCTGTATTT TAGCACATGT CGACCACGGT AAAACCTCAT TGAGTGACTC ATTATTAGCN 120
ACCAATGGNA TCATTTCCCA ACGTATGGCA GGTAAAGTTA GNTATCTTGA TTCGAGANGA 180
GATGAACAAT TGANNGGTNT AANCATG 207
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Lys Ile Ser Pro Glu Thr Val Asn Lys Leu Gln Ser Asp Ala Ser
1 5 10 15

Cys Ile Arg Asn Ile Cys Ile Leu Ala His Val Asp His Gly Lys Thr
20 25 30

Ser Leu Ser Asp Ser Leu Leu Xaa Thr Asn Xaa Ile Ile Ser Gln Arg
35 40 45

Met Ala Gly Lys Val Xaa Tyr Leu Asp Ser Arg Xaa Asp Glu Gln Leu
50 55 60

Xaa Gly Xaa Xaa Met
65

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2510 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 2481

(D) OTHER INFORMATION: /note= "N = A or T or C or G" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

AAGTCATGCG ATTGCAACAA GGATCACAAG AACCAGAAGT TCACGAACAT TTGATTAATT 60 

TGATTGATTC ACCTGGGCAT ATTGACTTTT CGTCTGAAGT GAGTACTTCT TCGAGATTAT 120 

GTGATGGTGC AGTTGTTTTG GTCGATGTCG TCGAAGGTGT CTGCTCACAA ACAGTCAACG 180

TTCTACGCCA ATGTTGGATT GATAAGTTGA AGCCATTACT AGTTATTAAC AAAATTGATA 240

GGTTAATCAC AGAATGGAAA TTGTCTCCCT TGGAGGCATA CCAACACATT TCCAGAATTA 300 

TAGAACAAGT AAACTCTGTG ATTGGGTCAT TTTTTGCTGG TGATAGACTA GAAGATGACT 360 

TGAATTGGCG TGAGGCTGGT TCTGTCGGGG AGTTTATCGA GAAGAGTGAT GAAGACTTGT 420 



ATTTCACACC TGAAAAGAAT AATGTAATAT TTGCCTCGGC AATAGATGGA TGGGCATTTT 480

CAGTCAATAC ATTTGCCAAA ATATACCTGA AAAAATTAGG GTTCTCTCAA CAAGCATTGT 540

CAAAAACTCT CTGGGGAGAC TTTTACTTGG ATATGAAAAA TAAAAAAATC ATCCCTGGTA 600

AAAAATTGAA AAATAATAGT AACAGTTTGA AGCCATTATT TGTTTCGTTG ATTTTGGACC 660

AGGTTTGGGC TGTTTATGAA AACTGTGTTA TTGAAAGAAA TCAAGACAAG TTGGAAAAAA 720

TCATTGAGAA ATTAGGGGCC AAAATCACCC CTCGTGATTT GCGATCCAAA GATTACAAGA 780

ACTTGCTAAA CTTGATTATG TCTCAGTGGA TTCCTTTGAG TCATGCCATA TTGGGGTCAG 840

TGATTGAATA CTTGCCAAGC CCCATTGTTG CTCAGCGTGA AAGAATAGAC AAGATTTTGG 900

ATGAAACGAT TTATAGTGCA GTGGATTCAG AACTGAGATA AATCCAAACT AGTCGACCCT 960

TCATTTGTCA AGGCGATGCA GGAATGTGAT AGTTCACACC CGGAAACCCA TACAATAGCA 1020

TATGTATCAA AATTGTTGTC AATCCCCAAT GAAGACTTAC CCAAAGCTAG TAATGCCGCT 1080

ACTGGAGGAT TGACGGCCGA TGAAATCCAA GAACGAGGAA GAATTGCTCG AGAATTAGCC 1140

AAAAAGGCAT CTGAAGCAGC TGCTTTGGCA CAAGAAGGTT CCAAAAATGA AGATGAGTTT 1200

GCCATTAAAC CCAAGAAAGA TCCATTTGAA TGGGAATTTG AGGAGGACGA TTTTGAGAAT 1260

GAGGAAGATG AGAGCGATGC AAACGCAGTT GAAGAATCAA CTGAAACCAT AGTGGGTTTC 1320

ACTCGTATTT ATTCTGGATC GTTATCTAGA GGCCAAAAGC TCACGGTAAT TGGACCCAAA 1380

TACGACCCTT CATTACCTAG AGACCATCAA ACCAACTTTG AACAAATAAC CAATGAAGTT 1440

GAAATTAAAG ACTTGTTTTT AATCATGGGA CGAGAATTAG TGAGAATGGA AAAAGTCCTG 1500

CGGGTAATAT TGTTGGGGTT GTTGGATTGG ATACGCCGTG CTTAAGAATG CCACAATTTG 1560

CTCACCGTTA CCTGAAGATA AACCATACAT TAATTTAGCT TCAACATCAA CCTTGATCCA 1620

CAATAAACCA ATTATGAAAA TAGCAGTTGA ACCAACAAAC CCAATAAAAC TAGCAAAATT 1680

GGAACGAGGA TTAGATTTAT TGGCCAAAGC CGACCCGGTT TTGGAATGGT ATGTCGACGA 1740

CGAGTCAGGT GAATTGATTG TTTGTGTTGC TGGAGAATTG CATCTAGAAC GATGCTTGAA 1800

AGATTTAGAA GAGAGATTCG CTAAGGGTTG TGAAGTTACC GTCAAAGAGC CAGTCATTCC 1860

CTTCAGAGAG GGGTTGGCAG ATGACAAAAT CAGTACCAAC ACCAATAATA ACAACGACGA 1920

CAATGAAGAT CATGAATTAG ATGAAAACGA AGATGAGCTT GCTGATTTAG AGTTTGATAT 1980

TTCTCCGTTG CCATTAGAAG TGACTCAGTT TTTAATTGAG AATGAAACGA TTATTGCCGA 2040

AATTGTCAAC AACAAGCAAG ATACTCATGA AATTAGAAAC GATTTTATTG AAAAATTTGC 2100

CACTATTATT GATAATTCTA ATTTGGCTAC ACAATTTCCA GACACCAAGT CTTTTATCAA 2160

CAATATAATT TGCTTTGGAC CTAAACGTGT TGGGCCTAAT ATTTTCATTG AAGATTATGG 2220

GTTAAACAAA TTTAGACATC TACTTGGTGA ATCTGCCACT GAATCTCGAT TTGTTTATGA 2280

GAATAATGTG TTCAATGGGG TTCAATTGGT ATTCAATGGG GGTCCGTTAG CATCAGAGCC 2340

AATGCAAGGT ATTATTGTTA GACTTAAGAA GGCAGAAAAA AGAGAAGTTG ACGAGGATAA 2400

GATAGTCAAC CCTGGTAAAA TAATCACACA GACTCGTGAC TTGATTTACA AGCGGTTTTT 2460

GCAAAAATCA CCACGCTTGT NCCTTGCAAT GTATACGTGT GAAATCCAAG 2510


(2) INFORMATION FOR SEQ ID NO: 32: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 310 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Val Met Arg Leu Gln Gln Gly Ser Gln Glu Pro Glu Val His Glu His
1 5 10 15

Leu Ile Asn Leu Ile Asp Ser Pro Gly His Ile Asp Phe Ser Ser Glu
20 25 30

Val Ser Thr Ser Ser Arg Leu Cys Asp Gly Ala Val Val Leu Val Asp
35 40 45

Val Val Glu Gly Val Cys Ser Gln Thr Val Asn Val Leu Arg Gln Cys
50 55 60

Trp Ile Asp Lys Leu Lys Pro Leu Leu Val Ile Asn Lys Ile Asp Arg
65 70 75 80

Leu Ile Thr Glu Trp Lys Leu Ser Pro Leu Glu Ala Tyr Gln His Ile
85 90 95

Ser Arg Ile Ile Glu Gln Val Asn Ser Val Ile Gly Ser Phe Phe Ala
100 105 110

Gly Asp Arg Leu Glu Asp Asp Leu Asn Trp Arg Glu Ala Gly Ser Val
115 120 125

Gly Glu Phe Ile Glu Lys Ser Asp Glu Asp Leu Tyr Phe Thr Pro Glu
130 135 140

Lys Asn Asn Val Ile Phe Ala Ser Ala Ile Asp Gly Trp Ala Phe Ser
145 150 155 160

Val Asn Thr Phe Ala Lys Ile Tyr Ser Lys Lys Leu Gly Phe Ser Gln
165 170 175

Gln Ala Leu Ser Lys Thr Leu Trp Gly Asp Phe Tyr Leu Asp Met Lys
180 185 190

Asn Lys Lys Ile Ile Pro Gly Lys Lys Leu Lys Asn Asn Ser Asn Ser
195 200 205

Leu Lys Pro Leu Phe Val Ser Leu Ile Leu Asp Gln Val Trp Ala Val
210 215 220

Tyr Glu Asn Cys Val Ile Glu Arg Asn Gln Asp Lys Leu Glu Lys Ile
225 230 235 240

Ile Glu Lys Leu Gly Ala Lys Ile Thr Pro Arg Asp Leu Arg Ser Lys
245 250 255

Asp Tyr Lys Asn Leu Leu Asn Leu Ile Met Ser Gln Trp Ile Pro Leu
260 265 270

Ser His Ala Ile Leu Gly Ser Val Ile Glu Tyr Leu Pro Ser Pro Ile
275 280 285

Val Ala Gln Arg Glu Arg Ile Asp Lys Ile Leu Asp Glu Thr Ile Tyr
290 295 300

Ser Ala Val Asp Ser Glu
305 310

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 188 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Asp Lys Ser Lys Leu Val Asp Pro Ser Phe Val Lys Ala Met Gln Glu
1 5 10 15

Cys Asp Ser Ser His Pro Glu Thr His Thr Ile Ala Tyr Val Ser Lys
20 25 30

Leu Leu Ser Ile Pro Asn Glu Asp Leu Pro Lys Ala Ser Asn Ala Ala
35 40 45

Thr Gly Gly Leu Thr Ala Asp Glu Ile Gln Glu Arg Gly Arg Ile Ala
50 55 60

Arg Glu Leu Ala Lys Lys Ala Ser Glu Ala Ala Ala Leu Ala Gln Glu
65 70 75 80

Gly Ser Lys Asn Glu Asp Glu Phe Ala Ile Lys Pro Lys Lys Asp Pro
85 90 95

Phe Glu Trp Glu Phe Glu Glu Asp Asp Phe Glu Asn Glu Glu Asp Glu
100 105 110

Ser Asp Ala Asn Ala Val Glu Glu Ser Thr Glu Thr Ile Val Gly Phe
115 120 125

Thr Arg Ile Tyr Ser Gly Ser Leu Ser Arg Gly Gln Lys Leu Thr Val
130 135 140

Ile Gly Pro Lys Tyr Asp Pro Ser Leu Pro Arg Asp His Gln Thr Asn
145 150 155 160

Phe Glu Gln Ile Thr Asn Glu Val Glu Ile Lys Asp Leu Phe Leu Ile
165 170 175

Met Gly Arg Glu Leu Val Arg Met Glu Lys Val Ser
180 185

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 336 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Gly Asn Ile Val Gly Val Val Gly Leu Asp Xaa Ala Val Leu Lys Asn
1 5 10 15

Ala Thr Ile Cys Ser Pro Leu Pro Glu Asp Lys Pro Tyr Ile Asn Leu
20 25 30

Ala Ser Thr Ser Thr Leu Ile His Asn Lys Pro Ile Met Lys Ile Ala
35 40 45

Val Glu Pro Thr Asn Pro Ile Lys Leu Ala Lys Leu Glu Arg Gly Leu
50 55 60

Asp Leu Leu Ala Lys Ala Asp Pro Val Leu Glu Trp Tyr Val Asp Asp
65 70 75 80

Glu Ser Gly Glu Leu Ile Val Cys Val Ala Gly Glu Leu His Leu Glu
85 90 95

Arg Cys Leu Lys Asp Leu Glu Glu Arg Phe Ala Lys Gly Cys Glu Val
100 105 110

Thr Val Lys Glu Pro Val Ile Pro Phe Arg Glu Gly Leu Ala Asp Asp
115 120 125

Lys Ile Ser Thr Asn Thr Asn Asn Asn Asn Asp Asp Asn Glu Asp His
130 135 140

Glu Leu Asp Glu Asn Glu Asp Glu Leu Ala Asp Leu Glu Phe Asp Ile
145 150 155 160

Ser Pro Leu Pro Leu Glu Val Thr Gln Phe Leu Ile Glu Asn Glu Thr
165 170 175

Ile Ile Ala Glu Ile Val Asn Asn Lys Gln Asp Thr His Glu Ile Arg
180 185 190

Asn Asp Phe Ile Glu Lys Phe Ala Thr Ile Ile Asp Asn Ser Asn Leu
195 200 205

Ala Thr Gln Phe Pro Asp Thr Lys Ser Phe Ile Asn Asn Ile Ile Cys
210 215 220

Phe Gly Pro Lys Arg Val Gly Pro Asn Ile Phe Ile Glu Asp Tyr Gly
225 230 235 240

Leu Asn Lys Phe Arg His Leu Leu Gly Glu Ser Ala Thr Glu Ser Arg
245 250 255

Phe Val Tyr Glu Asn Asn Val Phe Asn Gly Val Gln Leu Val Phe Asn
260 265 270

Gly Gly Pro Leu Ala Ser Glu Pro Met Gln Gly Ile Ile Val Arg Leu
275 280 285

Lys Lys Ala Glu Lys Arg Glu Val Asp Glu Asp Lys Ile Val Asn Pro
290 295 300

Gly Lys Ile Ile Thr Gln Thr Arg Asp Leu Ile Tyr Lys Arg Phe Leu
305 310 315 320

Gln Lys Ser Pro Arg Leu Xaa Leu Ala Met Tyr Thr Cys Glu Ile Gln
325 330 335



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 841 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 





(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 8

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 9

(D) OTHER INFORMATION: /note= "N = A or T or G or C"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 18

(D) OTHER INFORMATION: /note= "N = A or T or C or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CGCGAAGNNT CAATCATNTC AGAAGAAATG AAAGAAGGTA CTCCGTTCTT TACTATTGTG 60
GCAAGAATCC CTGTGATTGA GGCATTTGGG TTTTCCGAGG ATATTAGAAA GAAGACATCC 120
GGGGCAGCTA GTCCTCAATT AGTTTTTGAT GGGTATGATA TGTTAGATAT CGATCCATTT 180
TGGGTTCCAC ATACTGAAGA AGAATTAGAA GAATTGGGTG AATTTGCAGA AAGAGAAAAT 240
GTTGCTAGAA GATATATGAA TAATATCAGA AGAAGAAAAG GGTTATTTGT TGATGAGAAA 300
GTCGTCAAAA ATGCTGAAAA GCAAAGAACT TTGAAAAGAG ATTAGATTAT CCAGTAAAAC 360
AGGCAATATG TGTGAAATTG TTACAGAAAA GACAGATACG ATGTGGCCAT TATTTGTTTA 420
ATATTCAACA ACAAGTAAAT GTATTGATAT AGATGTATAA TATAGTCAAA TGTTGAGACT 480
ATCCGAATAG ACATAGACAC ACAACTCAGC CTGTCAGGGC TGTTTATTAA GTTGTGATGT 540
ATACTAAAAT CCATCCACAC TTCTCGTAAT TGTAGGGAAG AATTACAAAA AAGATCACAT 600
AAAAATAATA ATTCTATCAC ACTTTGAAAA TTTGATTGAA GGTGTTACTA GTATTGTTTC 660
AACATTACTC TTTTCAAACA ACGAGATCCA AATACTGCAC AATCTTCAAA CGAACGGAGT 720
TACATCACTA TAGTTTTCTA TTGTTGTAAG ATCAATACAG ACAAAAAGAA AGTGTAGCAT 780
AAATAATTGA TTGCAATTTG CCAAACTAGA AAACAAAGAG GAAAAAAAGA AAAAAATTTC 840
A 841

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 114 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Arg Glu Xaa Ser Ile Xaa Ser Glu Glu Met Lys Glu Gly Thr Pro Phe
1 5 10 15

Phe Thr Ile Val Ala Arg Ile Pro Val Ile Glu Ala Phe Gly Phe Ser
20 25 30

Glu Asp Ile Arg Lys Lys Thr Ser Gly Ala Ala Ser Pro Gln Leu Val
35 40 45

Phe Asp Gly Tyr Asp Met Leu Asp Ile Asp Pro Phe Trp Val Pro His
50 55 60

Thr Glu Glu Glu Leu Glu Glu Leu Gly Glu Phe Ala Glu Arg Glu Asn
65 70 75 80

Val Ala Arg Arg Tyr Met Asn Asn Ile Arg Arg Arg Lys Gly Leu Phe
85 90 95

Val Asp Glu Lys Val Val Lys Asn Ala Glu Lys Gln Arg Thr Leu Lys
100 105 110

Arg Asp

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 564 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

AACCTAAAAA TGGCTAAGTT CATCAAATCT GGTAAAGTTG CTATTGTTGT AAGAGGTCGT 60 

TACGCTGGTA AAAAAGTAGT CATTGTGAAA CCACATGATG AAGGTACCAA ATCTCACCCA 120 

TTCCCACATG CCATTGTCGC TGGTATTGAA AGAGCTCCAT TGAAGGTTAC CAAGAAGATG 180 

GATGCTAAAA AAGTTACCAA AAGAACTAAA GTCAAGCCAT TTGTTAAATT AGTAAACTAC 240 

AACCATTTAA TGCCAACTAG ATACTCATTG GATGTTGAAT CATTCAAATC TGCTGTCACT 300 

TCTGAAGCTT TAGAAGAACC ATCTCAAAGA GAAGAAGCTA AAAAAGTTGT CAAGAAGGCT 360 

TTTGAAGAAA AACATCAAGC TGGTAAGAAC AAATGGTTCT TCCAAAAATT ACACTTTTAA 420 

GAAAGGAACC ACCTTTATTT GAATGTTTGT AATATAGGTT GAATCAGAGA GACAAAGTAG 480

AAGAAAATAC AAAAAAGAGA GTATATCTGT ATAGTATT^T TTAATGGGGG TCTAATTTAC 540 

TTACCACTTT ATTCGTGCAT TATT 564
(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 136 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Met Ala Lys Phe Ile Lys Ser Gly Lys Val Ala Ile Val Val Arg Gly
1 5 10 15

Arg Tyr Ala Gly Lys Lys Val Val Ile Val Lys Pro His Asp Glu Gly
20 25 30

Thr Lys Ser His Pro Phe Pro His Ala Ile Val Ala Gly Ile Glu Arg
35 40 45

Ala Pro Leu Lys Val Thr Lys Lys Met Asp Ala Lys Lys Val Thr Lys
50 55 60

Arg Thr Lys Val Lys Pro Phe Val Lys Leu Val Asn Tyr Asn His Leu
65 70 75 80

Met Pro Thr Arg Tyr Ser Leu Asp Val Glu Ser Phe Lys Ser Ala Val
85 90 95

Thr Ser Glu Ala Leu Glu Glu Pro Ser Gln Arg Glu Glu Ala Lys Lys
100 105 110

Val Val Lys Lys Ala Phe Glu Glu Lys His Gln Ala Gly Lys Asn Lys
115 120 125

Trp Phe Phe Gln Lys Leu His Phe
130 135

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1192 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
TTTGAAACGA TTAAGTCCAA TCAAACAATC TTATTCAAAA GTACTCGCAA TACGTACAAT 60
GTCAATTCCA TCTACTCAGT ACGGATTTTT TTATAATAAA GCTAGTGGTC TTAATTTGAA 120
AAAAGACTTG CCGGTTAACA AGCCAGGTGC TGGTCAATTG CTTTTAAAGG TTGATGCAGT 180
TGGCCTTTGT CATTCAGATT TACATGTTCT CTATGAAGGT TTGGATTGTG GTGATAATTA 240
TGTGATGGGC CACGAAATTG CTGGGACTGT TGCTGAACTA GGTGAAGAGG TGAGTGAGTT 300
TGCAGTTGGA GATCGTGTCG CTTGTGTCGG CCCCAATGGA TGTGGTCTTT GTAAACACTG 360
TCTTACTGGT AACGATAATG TTTGTACCAA GTCGTTTTTG GATTGGTTTG GATTGGGTTA 420
CAATGGAGGT TACGAGCAAT TTTTGTTAGT CAAGAGACCA AGAAACTTGG TCAAGATCCC 480
TGACAATGTT ACTTCCGAGG AAGCTGCAGC TATTACGGAT GCCGTATTGA CTCCTTACCA 540
TGCTATCAAG TCTGCAGGTG TTGGTCCAGC AAGTAATATA TTAATTATCG GAGCTGGTGG 600
ATTAGGAGGT AACGCTATTC AAGTTGCAAA AGCATTTGGT GCGAAGGTTA CTGTTTTGGA 660



TAAAAAGGAT AAGGCAAGAG ACCAAGCTAA GGCCTTTGGA GCTGACCAGG TTTACAGTGA 720 

ATTACCAGAC AGCGTTTTAC CTGGGTCATT CAGTGCTTGT TTTGATTTTG TTTCGGTTCA 780 

GGCAACATAC GATTTGTGTC AAAAGTATTG TGAGCCAAAG GGTACTATTG TTCCCGTAGG 840

TCTAGGTGCA ACTTCGCTTA ACATAAATCT TGCTGATTTA GATCTTCGTG AAATTACCGT 900 

CAAGGGCTCA TTCTGGGGTA CCCTGATGGA TTTAAGAGAA GCATTTGAAT TGGCTGCACA 960 

GGGAAAGGTC AAACCAAATG TTGCTCATGC TCCATTGTCA GAATTGCCTA AGTATATGGA 1020 

GAAGTTGAGA GCCGGTGGTT ATGAAGGAAG AGTCGTGTTT AATCCATAAT ACTGAAAAGT 1080 

GAAGAAACCA TCAATAATAG CTTGGTGAGT ATGTATGGGA AATATTCATT TATGTATGTA 1140 

GGTCATTTAT ATGTGTGTAA TGATTTCTAA TCTGAATTTC GTACAATTCT TT 1192 
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 336 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 

Met Ser Ile Pro Ser Thr Gln Tyr Gly Phe Phe Tyr Asn Lys Ala Ser
1 5 10 15

Gly Leu Asn Leu Lys Lys Asp Leu Pro Val Asn Lys Pro Gly Ala Gly
20 25 30

Gln Leu Leu Leu Lys Val Asp Ala Val Gly Leu Cys His Ser Asp Leu
35 40 45

His Val Leu Tyr Glu Gly Leu Asp Cys Gly Asp Asn Tyr Val Met Gly
50 55 60

His Glu Ile Ala Gly Thr Val Ala Glu Leu Gly Glu Glu Val Ser Glu
65 70 75 80

Phe Ala Val Gly Asp Arg Val Ala Cys Val Gly Pro Asn Gly Cys Gly
85 90 95

Leu Cys Lys His Cys Leu Thr Gly Asn Asp Asn Val Cys Thr Lys Ser
100 105 110

Phe Leu Asp Trp Phe Gly Leu Gly Tyr Asn Gly Gly Tyr Glu Gln Phe
115 120 125

Leu Leu Val Lys Arg Pro Arg Asn Leu Val Lys Ile Pro Asp Asn Val
130 135 140

Thr Ser Glu Glu Ala Ala Ala Ile Thr Asp Ala Val Leu Thr Pro Tyr
145 150 155 160

His Ala Ile Lys Ser Ala Gly Val Gly Pro Ala Ser Asn Ile Leu Ile
165 170 175

Ile Gly Ala Gly Gly Leu Gly Gly Asn Ala Ile Gln Val Ala Lys Ala
180 185 190

Phe Gly Ala Lys Val Thr Val Leu Asp Lys Lys Asp Lys Ala Arg Asp
195 200 205

Gln Ala Lys Ala Phe Gly Ala Asp Gln Val Tyr Ser Glu Leu Pro Asp
210 215 220

Ser Val Leu Pro Gly Ser Phe Ser Ala Cys Phe Asp Phe Val Ser Val
225 230 235 240

Gln Ala Thr Tyr Asp Leu Cys Gln Lys Tyr Cys Glu Pro Lys Gly Thr
245 250 255

Ile Val Pro Val Gly Leu Gly Ala Thr Ser Leu Asn Ile Asn Leu Ala
260 265 270

Asp Leu Asp Leu Arg Glu Ile Thr Val Lys Gly Ser Phe Trp Gly Thr
275 280 285

Ser Met Asp Leu Arg Glu Ala Phe Glu Leu Ala Ala Gln Gly Lys Val
290 295 300

Lys Pro Asn Val Ala His Ala Pro Leu Ser Glu Leu Pro Lys Tyr Met
305 310 315 320

Glu Lys Leu Arg Ala Gly Gly Tyr Glu Gly Arg Val Val Phe Asn Pro
325 330 335

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1270 

(D) OTHER INFORMATION: /note= "R = A or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1395

(D) OTHER INFORMATION: /note= "R = A or G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

ATGGAAAAAA TTGACATTAA TACAAATTCA AACAAAATCC AACAAGCATA CGATAAAGTT 60 

GTTAGAGGAG ACCCAAATGC AACATTCGTC GTTTATTCTG TTGACAAAAA CGCCACTATG 120 

GACGTCACTG AAACAGGGGA CGGATCATTA GAGGATTTTG TTGAACATTT TACTGATGGA 180 

CAAGTTCAAT TTGGTTTAGC CAGGGTTACT GTTCCAGGAT CTGACGTTTC CAAAAACATC 240 

TTGTTAGGAT GGTGTCCTGA CAGTGCTCCA GCAAAATTGA GATTGTCATT TGCCAATAAT 300 

TTTGCTGATG TGTCCAGAGT ATTGAGCGGA TACCATGTGC AAATTACTGC AAGGGATCAA 360

GATGATTTAG ACGTGAATGA ATTCTTGAAT AGAGTTGGTG CTGCTGCTGG TGCAAGATAT 420

TCCACTCAAA CTTCCGGACT CAAAAAACCA TCCCCTGCTG CACCTAAACC TACTTCAAAA 480

CCTGTTGTTG CTAAATCTAG TTCTGCTTCA AAACCTTCAT TTGTACCCAA ATCTACTGGG 540

AAGCCTGTTG CTCCAGCTAA GCCAAAACCA AAGAACATCA CCAAGGATGC TGGTTGGGGT 600

GATGCTGAAG ACGTTGAGGA AAGAGACTTT GACAAGAAAC CTTTGGATAA CGTTCCATCG 660

GCATATAAAC CAACAAAGGT TAACATTGAC GAATTGAGAA AACAAAAATC AGATACAACT 720

AGCTCAACTC CTAAAACATT CAAATCTGAA CCACAAGAAG AAAAGAATGA CGATGATGGG 780

CAATCCAAAC CTTTATCGGA AAGGATGAAA GCCTATGATC AACCATCAAG TAGTGATGGA 840

AGATTGACTT CTTTACCAAA ACCAAAGATT GGACATTCTG TTGCCGATAA ATATAAAGCT 900 

AGTGCATCTG GGAATGGTGC TGCTCCTGCG TTTGGTGCTA AACCAGCATT TGGTACACAA 960 

TCAGTTGATT CAAGAAAGGA TAAATTGGTA GGTGGTTTGT CGAGAGATTT TGGTGCTGAA 1020 

AATGGAAAAA CTCCGGCACA AATTTGGGCT GAAAAAAGGG GAAAATACAA AACAGTGGCC 1080 

TCCGATGAGA AAGAAACTAA CTCAAGTGAA AAAGTTGATG AGCCAGAGGA ACATCATGCT 1140 

GCCGACTTGG CCAAAAAATT TGAAGAAAAG GCAAATATTG CTGGCGATAC TCCTTCCTTG 1200 

CCAACTAGAA ACTTACCACC AGCACCACCA GCACGAGAAA CCGCAATTCC ATCTAACGAA 1260 

AAAGACAAAR AAGAAAAGGA AGAGGAAGAA CAAGCTCCAG CACCATCTTT GCCTACTAGA 1320 

AACTTACCAC CACCGTCACA AAGACAACCT GAGCCCGAAC CAGAACCAGA AGAAGAGGAG 1380 

GAAGAAGAAG AAGARGAGGC TCCTGCTCCA AGCTTACCAG CAAGAAATCT CCCACCAGCA 1440

CCAAAAGCAG AAGCAGAAGA ATCAAAAAAA CAGTCAACCA CAGCCACCGC AGAGTATGAT 1500

TACGAAAAGG ACGAAGATAA TGAAATTGGA TTCTCCGAAG GTGACTTGAT TATTGATATT 1560

GAATTTGTGG ATGACGATTG GTGGCAAGGT AAACATGCTA AAACTGGTGA AGTTGGTTTG 1620 

TTTCCTGCCA CTTATGTGTC ATTAAATGAA AAAGCTGCTG ACAAAGAAGA GGAAGCCCCA 1680 

GCTCCAGCTC CAGCGCCATC ATTACCTTCT AGAGAAGAAA CACAAGCAGC ACCAGCATTA 1740 

CCAAGTAGAT CAGAGCAAAA ACCAGAATCA AAAACTGCTA CAGCTGAATA CGATTACGAA 1800

AAGGACGAAG ACAATGAAAT TGGTTTTTCA GAAGGTGATT TGATTGTTGA AATCGAATTT 1860

GTTGACGATG ATTGGTGGCA AGGAAAACAT TCCAAGACAG GAGAAGTCGG ATTGTTCCCT 1920 

GCTAACTATG TTGTCTTGAA TGAGTAGATT TAGTATAAAC AATATTCGTT TTTTTTTTAT 1980 

ATGAATCTAT AATATAAATA CAAAGAAAAG ATAAATTGGT G 2021 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 648 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Met Glu Lys Ile Asp Ile Asn Thr Asn Ser Asn Lys Ile Gln Gln Ala
1 5 10 15

Tyr Asp Lys Val Val Arg Gly Asp Pro Asn Ala Thr Phe Val Val Tyr
20 25 30

Ser Val Asp Lys Asn Ala Thr Met Asp Val Thr Glu Thr Gly Asp Gly
35 40 45

Ser Leu Glu Asp Phe Val Glu His Phe Thr Asp Gly Gln Val Gln Phe
50 55 60

Gly Leu Ala Arg Val Thr Val Pro Gly Ser Asp Val Ser Lys Asn Ile
65 70 75 80

Leu Leu Gly Trp Cys Pro Asp Ser Ala Pro Ala Lys Leu Arg Leu Ser
85 90 95

Phe Ala Asn Asn Phe Ala Asp Val Ser Arg Val Leu Ser Gly Tyr His
100 105 110

Val Gln Ile Thr Ala Arg Asp Gln Asp Asp Leu Asp Val Asn Glu Phe
115 120 125

Leu Asn Arg Val Gly Ala Ala Ala Gly Ala Arg Tyr Ser Thr Gln Thr
130 135 140

Ser Gly Leu Lys Lys Pro Ser Pro Ala Ala Pro Lys Pro Thr Ser Lys
145 150 155 160

Pro Val Val Ala Lys Ser Ser Ser Ala Ser Lys Pro Ser Phe Val Pro
165 170 175

Lys Ser Thr Gly Lys Pro Val Ala Pro Ala Lys Pro Lys Pro Lys Asn
180 185 190

Ile Thr Lys Asp Ala Gly Trp Gly Asp Ala Glu Asp Val Glu Glu Arg
195 200 205

Asp Phe Asp Lys Lys Pro Leu Asp Asn Val Pro Ser Ala Tyr Lys Pro
210 215 220

Thr Lys Val Asn Ile Asp Glu Leu Arg Lys Gln Lys Ser Asp Thr Thr
225 230 235 240

Ser Ser Thr Pro Lys Thr Phe Lys Ser Glu Pro Gln Glu Glu Lys Asn
245 250 255

Asp Asp Asp Gly Gln Ser Lys Pro Leu Ser Glu Arg Met Lys Ala Tyr
260 265 270

Asp Gln Pro Ser Ser Ser Asp Gly Arg Leu Thr Ser Leu Pro Lys Pro
275 280 285

Lys Ile Gly His Ser Val Ala Asp Lys Tyr Lys Ala Ser Ala Ser Gly
290 295 300

Asn Gly Ala Ala Pro Ala Phe Gly Ala Lys Pro Ala Phe Gly Thr Gln
305 310 315 320

Ser Val Asp Ser Arg Lys Asp Lys Leu Val Gly Gly Leu Ser Arg Asp
325 330 335

Phe Gly Ala Glu Asn Gly Lys Thr Pro Ala Gln Ile Trp Ala Glu Lys
340 345 350

Arg Gly Lys Tyr Lys Thr Val Ala Ser Asp Glu Lys Glu Thr Asn Ser
355 360 365

Ser Glu Lys Val Asp Glu Pro Glu Glu His His Ala Ala Asp Leu Ala
370 375 380

Lys Lys Phe Glu Glu Lys Ala Asn Ile Ala Gly Asp Thr Pro Ser Leu
385 390 395 400

Pro Thr Arg Asn Leu Pro Pro Ala Pro Pro Ala Arg Glu Thr Ala Ile
405 410 415

Pro Ser Asn Glu Lys Asp Lys Xaa Glu Lys Glu Glu Glu Glu Gln Ala
420 425 430

Pro Ala Pro Ser Leu Pro Thr Arg Asn Leu Pro Pro Pro Ser Gln Arg
435 440 445

Gln Pro Glu Pro Glu Pro Glu Pro Glu Glu Glu Glu Glu Glu Glu Glu
450 455 460

Xaa Glu Ala Pro Ala Pro Ser Leu Pro Ala Arg Asn Leu Pro Pro Ala
465 470 475 480

Pro Lys Ala Glu Ala Glu Glu Ser Lys Lys Gln Ser Thr Thr Ala Thr
485 490 495

Ala Glu Tyr Asp Tyr Glu Lys Asp Glu Asp Asn Glu Ile Gly Phe Ser
500 505 510

Glu Gly Asp Leu Ile Ile Asp Ile Glu Phe Val Asp Asp Asp Trp Trp
515 520 525

Gln Gly Lys His Ala Lys Thr Gly Glu Val Gly Leu Phe Pro Ala Thr
530 535 540

Tyr Val Ser Leu Asn Glu Lys Ala Ala Asp Lys Glu Glu Glu Ala Pro
545 550 555 560

Ala Pro Ala Pro Ala Pro Ser Leu Pro Ser Arg Glu Glu Thr Gln Ala
565 570 575

Ala Pro Ala Leu Pro Ser Arg Ser Glu Gln Lys Pro Glu Ser Lys Thr
580 585 590

Ala Thr Ala Glu Tyr Asp Tyr Glu Lys Asp Glu Asp Asn Glu Ile Gly
595 600 605

Phe Ser Glu Gly Asp Leu Ile Val Glu Ile Glu Phe Val Asp Asp Asp
610 615 620

Trp Trp Gln Gly Lys His Ser Lys Thr Gly Glu Val Gly Leu Phe Pro
625 630 635 640

Ala Asn Tyr Val Val Leu Asn Glu
645
645 

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1340 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



ATGTGTGACG TCGTATTAGG ATCTCAATGG GGGGATGAAG GTAAAGGTAA ATTAGTCGAT 60

TTATTATGTG ATGATATCGA TGTTTGTGCC AGGTGTCAAG GTGGTAACAA TGCTGGCCAC 120

ACAATTGTTG TTGGTAAAGT CAAGTATGAC TTCCACATGT TACCTTCTGG TTTGGTCAAT 180

CCTAAATGTC AAAACTTAGT TGGATCTGGT GTTGTTATCC ACGTTCCTTC CTTCTTTGCT 240

GAATTGGAAA ACTTGGAAGC AAAAGGGTTA GATTGTCGTG ATAGATTGTT TGTTTCATCT 300



AGAGCTCATT TGGTCTTTGA CTTCCATCAA CGTACTGATA AATTGAAAGA AGCTGAATTA 360 

TCAACCAATA AGAAATCAAT AGGTACTACC GGTAAAGGTA TTGGTCCAAC TTACTCAACC 420 

AAGGCAAGTA GATCAGGTAT CAGAGTCCAC CATTTAGTCA ACCCTGATCC AGAAGCTTGG 480

GAAGAATTCA AAACTAGATA TTTGAGATTA GTCGAGAGTA GACAAAAAAG ATACGGTGAA 540

TTTGAATATG ATCCTAAGGA AGAATTGGCA AGATTTGAAA AATACCGTGA AACCTTGAGA 600

CCATTCGTCG TCGACTCCGT CAACTTCATG CACGAAGCTA TTGCTGCCAA TAAAAAAATC 660

TTGGTTGAAG GTGCTAATGC GTTAATGTTG GATATTGATT TCGGTACTTA TCCATACGTC 720

iCTTCTTCAT CAACTGGTAT TGGTGGTGTT TTGACTGGGT TGGGTATTCC TCCAAGAACC 780 

ATCAGAAATG TCTATGGTGT TGTTAAAGCC TACACCACTA GAGTTGGTGA GGGTCCATTC 840

CCAACAGAAC AATTGAACAA GGTAGGTGAA ACTTTGCAAG ATGTTGGTGC CGAATATGGT 900

GTTACTACTG GAAGAAAAAG AAGATGTGGT TGGTTGGATT TGGTTGTGTT GAAATATTCC 960

:CTGATCA ACGGATACAC TTCTTTGAAC ATCACCAAAT TGGATGTTTT GGATAAATTC 1020 

\TTG AAGTTGGTGT TGCTTATAAA TTGAATGGAA AAGAGTTGCC AAGTTTCCCT 1080 

^GATTTGA TTGATTTAGC TAAAGTCGAG GTTGTGTATA AGAAATTCCC AGGTTGGGAA 114 0 

CAAGATATCA CCGGTATCAA GAAATATGAA GACTTGCCAG AAAACGCTAA GAACTATCTT 1200

AAATTCATTG AAGATTACTT GCAAGTTCCA ATCCAATGGG TAGGTACCGG TCCAGCTAGA 1260 

GATTCTATGT TAGAAAAGAA GATTTAGTTG TACACATGCT ACGGAAGACG ATTAGATTTG 1320 

TTTTATTAGA TTAATAACCT 1340 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 428 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Met Cys Asp Val Val Leu Gly Ser Gln Trp Gly Asp Glu Gly Lys Gly
1 5 10 15

Lys Leu Val Asp Leu Leu Cys Asp Asp Ile Asp Val Cys Ala Arg Cys
20 25 30

Gln Gly Gly Asn Asn Ala Gly His Thr Ile Val Val Gly Lys Val Lys
35 40 45

Tyr Asp Phe His Met Leu Pro Ser Gly Leu Val Asn Pro Lys Cys Gln
50 55 60

Asn Leu Val Gly Ser Gly Val Val Ile His Val Pro Ser Phe Phe Ala
65 70 75 80

Glu Leu Glu Asn Leu Glu Ala Lys Gly Leu Asp Cys Arg Asp Arg Leu
85 90 95

Phe Val Ser Ser Arg Ala His Leu Val Phe Asp Phe His Gln Arg Thr
100 105 110

Asp Lys Leu Lys Glu Ala Glu Leu Ser Thr Asn Lys Lys Ser Ile Gly
115 120 125

Thr Thr Gly Lys Gly Ile Gly Pro Thr Tyr Ser Thr Lys Ala Ser Arg
130 135 140

Ser Gly Ile Arg Val His His Leu Val Asn Pro Asp Pro Glu Ala Trp
145 150 155 160

Glu Glu Phe Lys Thr Arg Tyr Leu Arg Leu Val Glu Ser Arg Gln Lys
165 170 175

Arg Tyr Gly Glu Phe Glu Tyr Asp Pro Lys Glu Glu Leu Ala Arg Phe
180 185 190

Glu Lys Tyr Arg Glu Thr Leu Arg Pro Phe Val Val Asp Ser Val Asn
195 200 205

Phe Met His Glu Ala Ile Ala Ala Asn Lys Lys Ile Leu Val Glu Gly
210 215 220

Ala Asn Ala Leu Met Leu Asp Ile Asp Phe Gly Thr Tyr Pro Tyr Val
225 230 235 240

- Ss.r^.e,5-S^.^hr..Gl.y He .Gl-»y-5^y^<3^^^ 115 - 

245 250 255 

Pro Pro Arg Thr Ile Arg Asn Val Tyr Gly Val Val Lys Ala Tyr Thr
260 265 270

Thr Arg Val Gly Glu Gly Pro Phe Pro Thr Glu Gln Leu Asn Lys Val
275 280 285

Gly Glu Thr Leu Gln Asp Val Gly Ala Glu Tyr Gly Val Thr Thr Gly
290 295 300

Arg Lys Arg Arg Cys Gly Trp Leu Asp Leu Val Val Leu Lys Tyr Ser
305 310 315 320

Asn Ser Ile Asn Gly Tyr Thr Ser Leu Asn Ile Thr Lys Leu Asp Val
325 330 335

Leu Asp Lys Phe Lys Glu Ile Glu Val Gly Val Ala Tyr Lys Leu Asn
340 345 350

Gly Lys Glu Leu Pro Ser Phe Pro Glu Asp Leu Ile Asp Leu Ala Lys
355 360 365

Val Glu Val Val Tyr Lys Lys Phe Pro Gly Trp Glu Gln Asp Ile Thr
370 375 380

Gly Ile Lys Lys Tyr Glu Asp Leu Pro Glu Asn Ala Lys Asn Tyr Leu
385 390 395 400

Lys Phe Ile Glu Asp Tyr Leu Gln Val Pro Ile Gln Trp Val Gly Thr
405 410 415

Gly Pro Ala Arg Asp Ser Met Leu Glu Lys Lys Ile
420 425

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2481 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 






(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
ATGACTGGTG AAGAAGATAA AAAACAACAT TTTGATGCTT CTGGTGCTTC TGCTGTAGAT 60
GATAAAACAG CAACTGCAAT TTTAAGAAGA AAAAAGAAAG ATAATGCCTT GGTCGTTGAT 120
GACGCCACCA ACGATGACAA TTCTGTCATA ACCATGTCGT CAAACACAAT GGAATTGTTA 180
CAATTATTCC GTGGTGATAC AGTCTTGGTG AAAGGTAAGA AGAGAAAGGA CACAGTGTTG 240
ATCGTTTTAG CTGATGATGA TATGCCTGAT GGCGTTGCTA GAGTTAACAG ATGTGTTCGT 300
AACAATTTGC GTGTCAGATT GGGAGATATC GTTACTGTCC ATCCATGTCC TGATATTAAA 360
TATGCCAACA GAATCTCAGT ATTGCCAATT GCTGATACTG TTGAAGGTAT TAATGGTTCC 420
TTATTCGACC TTTACTTGAA GCCATATTTT GTTGAAGCCT ATAGACCAGT GAGAAAAGGT 480
GATTTATTCA CTGTGAGGGG TGGTATGAGA CAAGTAGAAT TCAAAGTTGT TGAAGTTGAC 540
CCTGAAGAAA TTGCAATTGT TGCTCAAGAT ACCATTATTC ATTGTGAAGG AGAACCTATT 600
AATCGTGAAG ATGAAGAAAA TAGCTTGAAT GAAGTGGGTT ACGACGATAT TGGAGGTTGT 660
AAGAAACAAA TGGCCCAAAT TAGAGAATTG GTTGAATTGC CTTTAAGACA TCCACAATTA 720
TTCAAATCGA TTGGTATTAA GCCACCAAAG GGTATTTTGA TGTATGGTCC ACCTGGTACC 780
GGTAAAACCA TTATGGCAAG AGCAGTGGCC AATGAAACAG GTGCCTTCTT TTTCTTAATA 840
AATGGTCCAG AAATTATGTC TAAAATGGCT GGTGAGTCTG AATCCAATTT AAGAAAAGCT 900
TTTGAAGAGG CTGAAAAGAA TTCTCCTTCC ATTATTTTCA TTGATGAGAT TGACTCTATT 960
GCCCCAAAGA GAGACAAAAC TAATGGTGAA GTAGAAAGAA GAGTTGTTTC TCAATTGTTA 1020
ACCCTTATGG ATGGTATGAA GGCCAGATCT AATGTAGTTG TTATTGCTGC TACTAACAGA 1080
CCAAATTCTA TTGATCCTGC TTTGAGAAGA TTTGGAAGAT TCGACAGAGA AGTTGACATT 1140
GGTGTTCCGG ATGCTGAAGG ACGTTTAGAG ATTTTGAGAA TCCACACAAA GAATATGAAA 1200
TTGGCTGATG ATGTTGACTT GGAAGCCATC GCTTCTGAAA CACATGGTTT CGTTGGTGCT 1260
GATATTGCTT CATTATGTTC AGAAGCTGCT ATGCAACAAA TCCGTGAAAA GATGGATCTT 1320
ATCGACTTGG AAGAAGAAAC CATTGATACT GAAGTGTTGA ACTCTTTGGG TGTCACTCAA 1380
GACAACTTCA GATTTGCTCT CGGAAACTCC AACCCATCTG CCTTGCGTGA AACTGTTGTT 1440
GAAAATGTTA ATGTCACTTG GGATGATATT GGTGGTTTGG ACAACATTAA GAATGAATTA 1500
AAAGAAACCG TGGAGTATCC TGTTTTACAT CCAGATCAAT ACCAAAAATT CGGATTGGCA 1560
CCAACAAAAG GTGTTTTGTT CTTTGGTCCA CCAGGTACTG GTAAGACACT TTTGGCCAAG 1620
GCTGTTGCTA CTGAAGTTTC TGCTAATTTC ATTTCTGTCA AAGGTCCAGA ATTGTTGAGT 1680
ATGTGGTATG GTGAATCTGA GTCTAATATC CGTGATATAT TTGACAAGGC CAGAGCTGCT 1740
GCTCCTACTG TGGTGTTTTT GGATGAATTG GACTCCATTG CCAAAGCTAG AGGTGGTTCT 1800
CACGGTGATG CTGGTGGTGC CTCCGACAGA GTGGTCAATC AATTGTTGAC TGAAATGGAC 1860
GGTATGAATG CTAAGAAGAA TGTGTTTGTC ATTGGTGCCA CTAACAGACC AGATCAAATT 1920
GATCCTGCAT TATTGAGACC AGGTAGATTG GATCAATTAA TTTATGTCCC ATTGCCAGAT 1980
GAGCCAGCTA GATTGTCTAT TTTACAAGCT CAATTGAGAA ACACTCCATT AGAACCTGGT 2040
TTGGACTTGA ACGAAATTGC CAAGATCACT CACGGTTTCT CGGGTGCAGA TTTGTCTTAT 2100
ATTGTTCAAA GATCTGCTAA ATTTGCTATT AAAGACTCTA TTGAAGCCCA AGTAAAGATT 2160
AACAAGATTA AAGAAGAAAA AGAAAAGGTG AAAACTGAAG ATGTTGATAT GAAGGTAGAT 2220
GAAGTTGAAG AAGAAGACCC TGTGCCTTAC ATTACCAGAG CTCACTTTGA AGAGGCTATG 2280
AAGACCGCAA AAAGATCTGT TTCAGACGCT GAATTACGTC GTTATGAGTC TTACGCTCAA 2340
CAATTGCAAG CCTCAAGAGG TCAATTTTCT AGCTTTAGAT TCAATGAAAA TGCTGGTGCC 2400
ACTGATAATG GTTCAGCAGC AGGTGCCAAC TCAGGTGCAG CTTTCGGAAA CGTTGAAGAG 2460
GAAGACGATT TGTACAGTTG A 2481

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 826 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Met Thr Gly Glu Glu Asp Lys Lys Gln His Phe Asp Ala Ser Gly Ala
1 5 10 15

Ser Ala Val Asp Asp Lys Thr Ala Thr Ala Ile Leu Arg Arg Lys Lys
20 25 30

Lys Asp Asn Ala Leu Val Val Asp Asp Ala Thr Asn Asp Asp Asn Ser
35 40 45

Val Ile Thr Met Ser Ser Asn Thr Met Glu Leu Leu Gln Leu Phe Arg
50 55 60

Gly Asp Thr Val Leu Val Lys Gly Lys Lys Arg Lys Asp Thr Val Leu
65 70 75 80

Ile Val Leu Ala Asp Asp Asp Met Pro Asp Gly Val Ala Arg Val Asn
85 90 95

Arg Cys Val Arg Asn Asn Leu Arg Val Arg Leu Gly Asp Ile Val Thr
100 105 110

Val His Pro Cys Pro Asp Ile Lys Tyr Ala Asn Arg Ile Ser Val Leu
115 120 125

Pro Ile Ala Asp Thr Val Glu Gly Ile Asn Gly Ser Leu Phe Asp Leu
130 135 140

Tyr Leu Lys Pro Tyr Phe Val Glu Ala Tyr Arg Pro Val Arg Lys Gly
145 150 155 160

Asp Leu Phe Thr Val Arg Gly Gly Met Arg Gln Val Glu Phe Lys Val
165 170 175

Val Glu Val Asp Pro Glu Glu Ile Ala Ile Val Ala Gln Asp Thr Ile
180 185 190

Ile His Cys Glu Gly Glu Pro Ile Asn Arg Glu Asp Glu Glu Asn Ser
195 200 205

Leu Asn Glu Val Gly Tyr Asp Asp Ile Gly Gly Cys Lys Lys Gln Met
210 215 220

Ala Gln Ile Arg Glu Leu Val Glu Leu Pro Leu Arg His Pro Gln Leu
225 230 235 240

Phe Lys Ser Ile Gly Ile Lys Pro Pro Lys Gly Ile Leu Met Tyr Gly
245 250 255

Pro Pro Gly Thr Gly Lys Thr Ile Met Ala Arg Ala Val Ala Asn Glu
260 265 270

Thr Gly Ala Phe Phe Phe Leu Ile Asn Gly Pro Glu Ile Met Ser Lys
275 280 285

Met Ala Gly Glu Ser Glu Ser Asn Leu Arg Lys Ala Phe Glu Glu Ala
290 295 300

Glu Lys Asn Ser Pro Ser Ile Ile Phe Ile Asp Glu Ile Asp Ser Ile
305 310 315 320

Ala Pro Lys Arg Asp Lys Thr Asn Gly Glu Val Glu Arg Arg Val Val
325 330 335

Ser Gln Leu Leu Thr Leu Met Asp Gly Met Lys Ala Arg Ser Asn Val
340 345 350

Val Val Ile Ala Ala Thr Asn Arg Pro Asn Ser Ile Asp Pro Ala Leu
355 360 365

Arg Arg Phe Gly Arg Phe Asp Arg Glu Val Asp Ile Gly Val Pro Asp
370 375 380

Ala Glu Gly Arg Leu Glu Ile Leu Arg Ile His Thr Lys Asn Met Lys
385 390 395 400

Leu Ala Asp Asp Val Asp Leu Glu Ala Ile Ala Ser Glu Thr His Gly
405 410 415

Phe Val Gly Ala Asp Ile Ala Ser Leu Cys Ser Glu Ala Ala Met Gln
420 425 430

Gln Ile Arg Glu Lys Met Asp Leu Ile Asp Leu Glu Glu Glu Thr Ile
435 440 445

Asp Thr Glu Val Leu Asn Ser Leu Gly Val Thr Gln Asp Asn Phe Arg
450 455 460

Phe Ala Leu Gly Asn Ser Asn Pro Ser Ala Leu Arg Glu Thr Val Val
465 470 475 480

Glu Asn Val Asn Val Thr Trp Asp Asp Ile Gly Gly Leu Asp Asn Ile
485 490 495

Lys Asn Glu Leu Lys Glu Thr Val Glu Tyr Pro Val Leu His Pro Asp
500 505 510

Gln Tyr Gln Lys Phe Gly Leu Ala Pro Thr Lys Gly Val Leu Phe Phe
515 520 525

Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu Ala Lys Ala Val Ala Thr
530 535 540

Glu Val Ser Ala Asn Phe Ile Ser Val Lys Gly Pro Glu Leu Leu Ser
545 550 555 560

Met Trp Tyr Gly Glu Ser Glu Ser Asn Ile Arg Asp Ile Phe Asp Lys
565 570 575



Ala Arg Ala Ala Ala Pro Thr Val Val Phe Leu Asp Glu Leu Asp Ser 
580 585 590 

Ile Ala Lys Ala Arg Gly Gly Ser His Gly Asp Ala Gly Gly Ala Ser
595 600 605

Asp Arg Val Val Asn Gln Leu Leu Thr Glu Met Asp Gly Met Asn Ala
610 615 620

Lys Lys Asn Val Phe Val Ile Gly Ala Thr Asn Arg Pro Asp Gln Ile
625 630 635 640

Asp Pro Ala Leu Leu Arg Pro Gly Arg Leu Asp Gln Leu Ile Tyr Val
645 650 655

Pro Leu Pro Asp Glu Pro Ala Arg Leu Ser Ile Leu Gln Ala Gln Leu
660 665 670

Arg Asn Thr Pro Leu Glu Pro Gly Leu Asp Leu Asn Glu Ile Ala Lys
675 680 685

Ile Thr His Gly Phe Ser Gly Ala Asp Leu Ser Tyr Ile Val Gln Arg
690 695 700

Ser Ala Lys Phe Ala Ile Lys Asp Ser Ile Glu Ala Gln Val Lys Ile
705 710 715 720

Asn Lys Ile Lys Glu Glu Lys Glu Lys Val Lys Thr Glu Asp Val Asp
725 730 735

Met Lys Val Asp Glu Val Glu Glu Glu Asp Pro Val Pro Tyr Ile Thr
740 745 750

Arg Ala His Phe Glu Glu Ala Met Lys Thr Ala Lys Arg Ser Val Ser
755 760 765

Asp Ala Glu Leu Arg Arg Tyr Glu Ser Tyr Ala Gln Gln Leu Gln Ala
770 775 780

Ser Arg Gly Gln Phe Ser Ser Phe Arg Phe Asn Glu Asn Ala Gly Ala
785 790 795 800 

Thr Asp Asn Gly Ser Ala Ala Gly Ala Asn Ser Gly Ala Ala Phe Gly 
805 810 815 

Asn Val Glu Glu Glu Asp Asp Leu Tyr Ser 
820 825 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1918 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

TTTTTTTTTC TCCCTCTCTC TCGTTCAGAT TCTGTAGAAT TGATTGGTTG AGAGTAAAAG 60 

TCAGACTTTT TTTTTTGCTC TCCATCTAGT GGGACAAATA AGAAGTTTAA CAAAGAACGA 120 

CAAAAAATCC TCACCAGAAG AAAAAAAAAT CAATTTTCAC AGGTAAAGTT GTACGGACAG 180 

CACGACAGAC ACAAAACTAA AGTAAATCCA TGAGGAAAAA AGTAAAAAAA AAAAAATTGT 240 



TCACCACAAC TTCAAGAGCC ATTAAAACCA AAAATTTGGA ATATAAATTT CAACTGATTT 300

CTTGCTGGAT TTTTTTGTAT ATATTTGCAA TTGATTTCCT TTTACTTTTT TTTTTTCCAT 360

TTCTTCTTTT CCTTTTTCCA TCTTTTAAGT TTCTTTTAGA ATATAGTATA TTTATCAAAC 420

AATGTCTGCA TTCAGATCAA TTCAACGTTC AACCAACGTA GCCAAGAGCA CTTTCAAAAA 480

CAGCATCAGA ACATATGCTT CTGCTGAACC AGTATGTATT CACTTTTTTG AGGATCCGGG 540

CAATGTGCTT GGGATTTTAC TTTTAACGTA TATACAAAGA TAATTTACTA ACTTGCTTTC 600

TTAGACCTTA AAACAAAGAT TGGAAGAAAT CTTGCCAGCC AAAGCTGAAG AAGTTAAACA 660

ATTCAAAAAA GAACACGGTA AAACTGTCAT TGGTGAAGTT TTATTAGAAC AAGCTTACGG 720

TGGTATGAGA GGTATCAAAG GTTTAGTTTG GGAAGGTTCT GTTTTGGACC CAATTGAAGG 780

TATCCGTTTC AGAGGAAGAA CCATCCCAGA CATTCAAAAA GAATTGCCAA AAGCACCAGG 840

TGGTGAAGAA CCATTACCAG AAGCTCTTTT CTGGTTGTTG TTGACTGGTG AAGTTCCAAC 900

TGACGCCCAA ACTAAGGCTT TATCCGAAGA ATTTGCTGCT AGATCAGCAT TACCAAAGCA 960

CGTTGAAGAA TTGATCGACA GATCTCCATC TCACTTGCAC CCAATGGCTC AATTCTCCAT 1020

TGCCGTTACT GCTTTGGAAT CTGAATCCCA ATTTGCCCAA GCTTATGCTA AAGGTGCCAA 1080

CAAATCCGAA TACTGGAAAT ACACTTACGA AGATTCCATC GATTTGTTAG CTAAATTGCC 1140

AACCATTGCT GCTAAGATTT ACAGAAACGT TTTCCACGAT GGTAAATTGC CAGCTGCCAT 1200

TGACTCCAAA TTGGATTACG GTGCTAACTT GGCCAGTTTG TTAGGTTTTG GTGACAACAA 1260

GGAATTTGTT GAATTAATGA GATTGTACCT TACCATCCAC TCTGACCACG AAGGTGGTAA 1320

CGTCTCTGCA CACACCACCC ACTTGGTTGG TTCCGCTTTA TCTTCCCCAT TCTTGTCATT 1380

AGCTGCTGGT TTGAATGGTT TAGCTGGTCC ATTACACGGT AGAGCTAACC AAGAAGTTTT 1440

GGAATGGTTG TTCAAATTAA GAGAAGAATT AAACGGTGAC TACTCCAAGG AAGCCATTGA 1500

AAAATACTTG TGGGAAACCT TGAACTCCGG TAGAGTTGTC CCAGGTTACG GTCACGCTGT 1560

CTTGAGAAAG ACCGATCCAA GATACACTGC TCAAAGAGAA TTTGCTCTTA AACATATGCC 1620

AGACTACGAA TTGTTCAAAT TGGTTTCAAA CATTTACGAA GTCGCTCCAG GTGTTTTAAC 1680

CAAACACGGT AAGACCAAGA ACCCATGGCC AAATGTGGAC TCCCACTCTG GTGTCTTGTT 1740

ACAATACTAC GGTTTGACTG AACAATCTTT CTACACTGTC TTGTTCGGTG TTTCCAGAGC 1800

CTTTGGTGTC TTGCCACAAT TGATCTTGGA CCGTGGTATC GGTATGCCAA TTGAAAGACC 1860

AAAATCTTTC TCCACTGAAA AATACATTGA ATTGGTCAAA AACATCAACA AAGCTTAA 1918


(2) INFORMATION FOR SEQ ID NO: 48: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 466 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 



Met Ser Ala Phe Arg Ser Ile Gln Arg Ser Thr Asn Val Ala Lys Ser
1 5 10 15

Thr Phe Lys Asn Ser Ile Arg Thr Tyr Ala Ser Ala Glu Pro Thr Leu
20 25 30

Lys Gln Arg Leu Glu Glu Ile Leu Pro Ala Lys Ala Glu Glu Val Lys
35 40 45

Gln Phe Lys Lys Glu His Gly Lys Thr Val Ile Gly Glu Val Leu Leu
50 55 60

Glu Gln Ala Tyr Gly Gly Met Arg Gly Ile Lys Gly Leu Val Trp Glu
65 70 75 80

Gly Ser Val Leu Asp Pro Ile Glu Gly Ile Arg Phe Arg Gly Arg Thr
85 90 95

Ile Pro Asp Ile Gln Lys Glu Leu Pro Lys Ala Pro Gly Gly Glu Glu
100 105 110

Pro Leu Pro Glu Ala Leu Phe Trp Leu Leu Leu Thr Gly Glu Val Pro
115 120 125

Thr Asp Ala Gln Thr Lys Ala Leu Ser Glu Glu Phe Ala Ala Arg Ser
130 135 140

Ala Leu Pro Lys His Val Glu Glu Leu Ile Asp Arg Ser Pro Ser His
145 150 155 160

Leu His Pro Met Ala Gln Phe Ser Ile Ala Val Thr Ala Leu Glu Ser
165 170 175

Glu Ser Gln Phe Ala Gln Ala Tyr Ala Lys Gly Ala Asn Lys Ser Glu
180 185 190

Tyr Trp Lys Tyr Thr Tyr Glu Asp Ser Ile Asp Leu Leu Ala Lys Leu
195 200 205

Pro Thr Ile Ala Ala Lys Ile Tyr Arg Asn Val Phe His Asp Gly Lys
210 215 220

Leu Pro Ala Ala Ile Asp Ser Lys Leu Asp Tyr Gly Ala Asn Leu Ala
225 230 235 240

Ser Leu Leu Gly Phe Gly Asp Asn Lys Glu Phe Val Glu Leu Met Arg
245 250 255

Leu Tyr Leu Thr Ile His Ser Asp His Glu Gly Gly Asn Val Ser Ala
260 265 270

His Thr Thr His Leu Val Gly Ser Ala Leu Ser Ser Pro Phe Leu Ser
275 280 285

Leu Ala Ala Gly Leu Asn Gly Leu Ala Gly Pro Leu His Gly Arg Ala
290 295 300

Asn Gln Glu Val Leu Glu Trp Leu Phe Lys Leu Arg Glu Glu Leu Asn
305 310 315 320

Gly Asp Tyr Ser Lys Glu Ala Ile Glu Lys Tyr Leu Trp Glu Thr Leu
325 330 335

Asn Ser Gly Arg Val Val Pro Gly Tyr Gly His Ala Val Leu Arg Lys
340 345 350

Thr Asp Pro Arg Tyr Thr Ala Gln Arg Glu Phe Ala Leu Lys His Met
355 360 365

Pro Asp Tyr Glu Leu Phe Lys Leu Val Ser Asn Ile Tyr Glu Val Ala
370 375 380

Pro Gly Val Leu Thr Lys His Gly Lys Thr Lys Asn Pro Trp Pro Asn
385 390 395 400

Val Asp Ser His Ser Gly Val Leu Leu Gln Tyr Tyr Gly Leu Thr Glu
405 410 415

Gln Ser Phe Tyr Thr Val Leu Phe Gly Val Ser Arg Ala Phe Gly Val
420 425 430

Leu Pro Gln Leu Ile Leu Asp Arg Gly Ile Gly Met Pro Ile Glu Arg
435 440 445

Pro Lys Ser Phe Ser Thr Glu Lys Tyr Ile Glu Leu Val Lys Asn Ile
450 455 460 

Asn Lys 
465 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 678 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 



TTTTCTGATT ATCATGTTAT TTGGTTAGCT AAACGGAATA ATGGGATAAT GGAAGCTGAA 60

TATCGATTAT ATTTATTAGT TATCACTTTA ATCATTTCAC CCGTAGGGTT AATTATGTTT 120

GGTGTTGGTG CCGCTAGAGA ATGGCCATGG CAAGTGATTT ATGTTGGATT AGGTTTCATT 180

GGGTTTGGTT GGGGATCAAT TGGTGATACT TCAATGTCTT ATTTAATGGA TGCTTATCCT 240

GATATTGTCA TTCAAGGAAT GGTGGGAGTA AGTATTATTA ATAATACTTT GGCTTGTATT 300

TTCACTTTTG CTTGTTCTTA TTGGTTAGAT GGATCAGGAA CACAAAACAC ATATATTGCC 360

TTGTCAATTA TTGATTTTGC TACCATAGCA TTGGTTTTCC CCTTTTTATA TTATGGTAAA 420

ACATTTAGAA GGAAAACTAA AAGACTTTAT GTTTCAATGG TTGAATTGAC TCAAGGGATG 480

GGATAAGAGA GTGAGTGGTA AAAGAATTTT ATTAATGATA CATTTATTAT TAGAATTACT 540

ACTATGGAAA TCCGAGTCTG TGTTTTTTTT AGAAGTATAT TTTAGACGTA TTTAGAGTTG 600

TTTTTCTCCT TTGTACTTTA TTTAGCATTT TATAATATAT TAATTCTAGT TGCATTAATA 660

TATATAAATA AAAAAACT 678



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 159 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Ser Asp Tyr His Val Ile Trp Leu Ala Lys Arg Asn Asn Gly Ile Met
1 5 10 15

Glu Ala Glu Tyr Arg Leu Tyr Leu Leu Val Ile Thr Leu Ile Ile Ser
20 25 30

Pro Val Gly Leu Ile Met Phe Gly Val Gly Ala Ala Arg Glu Trp Pro
35 40 45

Trp Gln Val Ile Tyr Val Gly Leu Gly Phe Ile Gly Phe Gly Trp Gly
50 55 60

Ser Ile Gly Asp Thr Ser Met Ser Tyr Leu Met Asp Ala Tyr Pro Asp
65 70 75 80

Ile Val Ile Gln Gly Met Val Gly Val Ser Ile Ile Asn Asn Thr Leu
85 90 95

Ala Cys Ile Phe Thr Phe Ala Cys Ser Tyr Trp Leu Asp Gly Ser Gly
100 105 110

Thr Gln Asn Thr Tyr Ile Ala Leu Ser Ile Ile Asp Phe Ala Thr Ile
115 120 125

Ala Leu Val Phe Pro Phe Leu Tyr Tyr Gly Lys Thr Phe Arg Arg Lys
130 135 140

Thr Lys Arg Leu Tyr Val Ser Met Val Glu Leu Thr Gln Gly Met
145 150 155

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1480 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1060

(D) OTHER INFORMATION: /note= "R = A or G"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1063

(D) OTHER INFORMATION: /note= "Y = C or T"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1123

(D) OTHER INFORMATION: /note= "Y = C or T"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

TTTGGATTTT CAATTACAAG ATATTTTGCA TCATGTTGAA AGCAAATGGT TTGGTGGGTT 60 

TATTTCAGGT ATTTTCACTA ATGACAATGA CGTTGAAAAT GAATCCAAGA ACGTGTTTCA 120 

TAAATTCAAA CAAGATTTAA TGAAAATTTT GAAAGATTGT TTAACCGTAA GTGACGATAA 180 



ATCGAATATA GAGAGGTTTC TTCAGTTTAA TGAATTTATT TATTACTGCT TTTACTCAAT 240

GGAGGAATAT AATTATGAAT TGGTTGATGA TTTGATAAAA TTTATAACTA TAAATATGAA 300

TTCTCATGGC AGAATAGTTA ATTTTGGCAC TAATGTTAAA ATTAATAAAT TACACGAATT 360

AATTAAGAAT TTGATTGATA AAGTTAATAA AAACAAACAA AATGTGACTA GCAACAACAA 420

AAACAACAAC AACAACAACA GCAACAACAA CAGCAACAGC AACAATTCCC AACATATTGT 480

TTTGATACCT AATGCCAACT GTTCCAATTT CCCATGGGAA TCGATGGAAT TTCTTCGTAG 540

TAAATCAATT TCAAGAATGC CATCAATTCA TATGTTACTT GATCTAGTCA AATCAAACAC 600

CAATAACAAG AACAAGTTAA TGTTTGTTGA TAAATCTAAT TTGTATTATT TGATTAATCC 660

CAGTGGTGAT TTAATTCGAT CAGAAAATCG ATTCAAAAAA CTATTTGAAT CAAATCATTT 720

ATGGAGAGGG GAAATTGGAA AATTATCAAG TAATGAACAT GAAGATTATC AAGATTCAAT 780

ATTATGTGAA ATCTTGAAAA GTCATTTATT TGTTTATATT GGTCATGGTG GTTGTGATCA 840

ATATATTAAA GTATCAAAAT TATTTAAAAA ATGTGGCAAT AATCAAGATT TACTGAATAA 900

ATTACCTCCT AGTTTATTGT TAGGTTGTTC ATCAGTTAAA TTAGATAATT GTAATTATAA 960

CTATAATTCC AGTATGTTAC AACCACTGGG TAATATTTAT AATTGGTTGA ACTGTAAATC 1020

GTCAATGATA CTCGGGAATC TATGGGATGT TACTGATAAR GAYATTGATA TTTTTACACT 1080

TTCATTACTA CAAAAATGGG GGTTAATAGA TGATTATAAT GGYAGTGGCC ATGATTATGG 1140

TATGAAGAAA TTGGATTTGA CTAATTGTGT TGTTCAAAGT CGAAGTAAAT GTACTTTGAA 1200

ATACTTGAAT GGATCAGCAC CTGTGGTTTA TGGTCTACCA ATGTATTTAA AATAGACATT 1260

CTGTTTGCAT ATAAGTTTAT ATATTTTAAT AATAAGAAAA AGAGCATAAT TTGGATCTTG 1320

ATTTTGTATT GTTTGGTTTG TTATGAACAA ATTTTGCACC CAATCACTAT CGAACTTTCT 1380

TTTTTAAACA GAGAACATTT AATCAACATT TATGTTACAT TTAAGCGTTT AAATACATAT 1440

TTGTGTTAGA TAGTTATATA ATGTTTGATG CAAACATACA 1480



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 417 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Leu Asp Phe Gln Leu Gln Asp Ile Leu His His Val Glu Ser Lys Trp
1 5 10 15

Phe Gly Gly Phe Ile Ser Gly Ile Phe Thr Asn Asp Asn Asp Val Glu
20 25 30

Asn Glu Ser Lys Asn Val Phe His Lys Phe Lys Gln Asp Leu Met Lys
35 40 45

Ile Leu Lys Asp Cys Leu Thr Val Ser Asp Asp Lys Ser Asn Ile Glu
50 55 60

Arg Phe Leu Gln Phe Asn Glu Phe Ile Tyr Tyr Cys Phe Tyr Ser Met
65 70 75 80

Glu Glu Tyr Asn Tyr Glu Leu Val Asp Asp Leu Ile Lys Phe Ile Thr
85 90 95

Ile Asn Met Asn Ser His Gly Arg Ile Val Asn Phe Gly Thr Asn Val
100 105 110

Lys Ile Asn Lys Leu His Glu Leu Ile Lys Asn Leu Ile Asp Lys Val
115 120 125

Asn Lys Asn Lys Gln Asn Val Thr Ser Asn Asn Lys Asn Asn Asn Asn
130 135 140

Asn Asn Ser Asn Asn Asn Ser Asn Ser Asn Asn Ser Gln His Ile Val
145 150 155 160

Leu Ile Pro Asn Ala Asn Cys Ser Asn Phe Pro Trp Glu Ser Met Glu
165 170 175

Phe Leu Arg Ser Lys Ser Ile Ser Arg Met Pro Ser Ile His Met Leu
180 185 190

Leu Asp Leu Val Lys Ser Asn Thr Asn Asn Lys Asn Lys Leu Met Phe
195 200 205

Val Asp Lys Ser Asn Leu Tyr Tyr Leu Ile Asn Pro Ser Gly Asp Leu
210 215 220

Ile Arg Ser Glu Asn Arg Phe Lys Lys Leu Phe Glu Ser Asn His Leu
225 230 235 240

Trp Arg Gly Glu Ile Gly Lys Leu Ser Ser Asn Glu His Glu Asp Tyr
245 250 255

Gln Asp Ser Ile Leu Cys Glu Ile Leu Lys Ser His Leu Phe Val Tyr
260 265 270

Ile Gly His Gly Gly Cys Asp Gln Tyr Ile Lys Val Ser Lys Leu Phe
275 280 285

Lys Lys Cys Gly Asn Asn Gln Asp Leu Ser Asn Lys Leu Pro Pro Ser
290 295 300

Leu Leu Leu Gly Cys Ser Ser Val Lys Leu Asp Asn Cys Asn Tyr Asn
305 310 315 320

Tyr Asn Ser Ser Met Leu Gln Pro Ser Gly Asn Ile Tyr Asn Trp Leu
325 330 335

Asn Cys Lys Ser Ser Met Ile Leu Gly Asn Leu Trp Asp Val Thr Asp
340 345 350

Xaa Xaa Ile Asp Ile Phe Thr Leu Ser Leu Leu Gln Lys Trp Gly Leu
355 360 365

Ile Asp Asp Tyr Asn Xaa Ser Gly His Asp Tyr Gly Met Lys Lys Leu
370 375 380

Asp Leu Thr Asn Cys Val Val Gln Ser Arg Ser Lys Cys Thr Leu Lys
385 390 395 400



Tyr Leu Asn Gly Ser Ala Pro Val Val Tyr Gly Leu Pro Met Tyr Leu 
405 410 415 

Lys 



(2) INFORMATION FOR SEQ ID NO: 53: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1443 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 






CTTCTTTTAG AGACAATGCA GTGGTTTTCT TACCAGATGC ATGACCCCCA CCCAATAAAA 60

CTATAATCGA TCTATTCACA GTATTTGATG CCATTTTGAT GGTGATGAAT GATGTGATGT 120

GATGCTCATC TTATTGGGAG TTTCAAAAAA AAAAGTTACA CTCGAAAAAA AAAAAATAGC 180

ATTATAAATA GAAGCTTTAC TATCTTATAG AACAAAACAA AAAACACTAT CTTCTAATTA 240

ATAATGGATG ATTTTGATAG AGATTTAGAT AATGAGTTGG AATTTAGTCA TAAATCAACG 300

AAAGGAATAA AGGTTCATCG CACTTTTGAA AGTATGAATT TGAAACCTGA TCTTTTGAAA 360

GGAATATATG CCTATGGATT TGAAGCACCA TCTGCTATTC AATCTAGGGC TATTATGCAG 420

ATCATCAGTG GTAGAGACAC AATAGCACAG GCACAATCTG GAACTGGTAA AACTGCTACT 480

TTTTCTATTG GTATGCTTGA GGTTATAGAT ACTAAATCAA AAGAGTGTCA AGCACTTATC 540

TTGTCTCCTA CTAGAGAGTT GGCAATTCAA ATACAAAATG TGGTCATGCA TTTAGGAGAT 600

TATATGAACA TTCACACCCA TGCCTGTATT GGTGGGAAAA ATGTCGGTGA GGATGTTAAG 660

AAATTGCAGC AAGGGCAACA AATAGTTAGT GGGACACCAG GTAGAGTGAT TGATGTGATA 720

[illegible] ATCTACAAAC TAGAAATATC AAGGTTCTTA TTTTAGATGA AGCTGATGAA 780

CTTTTTACAA AAGGGTTTAA AGAACAGATC TACGAAATCT ACAAACATTT ACCACCTTCG 840

GTTCAAGTAG TAGTTGTTAG TGCCACTTTG CCACGTGAAG TATTGGAGAT [illegible] 900

TTTACCACTG ATCCAGTGAA AATCTTGGTG AAGAGGGATG AGATTTCGCT TCTGGGAATC 960

AAACAATATT ATGTTCAATG TGAACGTGAA GATTGGAAGT TTGATACACT ATGTGATTTG 1020

TATGACAACC TTACAATAAC TCAAGCAGTG ATATTTTGTA ATACCAAATT GAAGGTGAAT 1080

TGGCTTGCTG ATCAAATGAA AAAGCAAAAC TTTACTGTTG TGGCAATGCA TGGTGATATG 1140

AAACAAGATG AACGAGATTC AATTATGAAC GATTTTAGAA GGGGGAATTC AAGAGTATTA 1200

ATATCTACAG ATGTTTGGGC AAGAGGTATT GATGTCCAAC AAGTCTCGTT GGTAATAAAT 1260

TATGATTTGC CCACCGATAA GGAAAACTAT ATTCATAGAA TTGGACGATC AGGTAGATTT 1320

GGTAGAAAGG GAACAGCTAT AAACTTGATA ACTAAAGATG ATGTGGTCAC TTTAAAAGAA 1380

TTGGAGAAAT ATTATTCAAC GAAAATTAAG GAAATGCCAA TGAATATTAA TGATATAATG 1440

TAA 1443


(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 399 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Met Asp Asp Phe Asp Arg Asp Leu Asp Asn Glu Leu Glu Phe Ser His
1 5 10 15

Lys Ser Thr Lys Gly Ile Lys Val His Arg Thr Phe Glu Ser Met Asn
20 25 30

Leu Lys Pro Asp Leu Leu Lys Gly Ile Tyr Ala Tyr Gly Phe Glu Ala
35 40 45

Pro Ser Ala Ile Gln Ser Arg Ala Ile Met Gln Ile Ile Ser Gly Arg
50 55 60

Asp Thr Ile Ala Gln Ala Gln Ser Gly Thr Gly Lys Thr Ala Thr Phe
65 70 75 80 

Ser Ile Gly Met Leu Glu Val Ile Asp Thr Lys Ser Lys Glu Cys Gln
85 90 95



Ala Leu Ile Leu Ser Pro Thr Arg Glu Leu Ala Ile Gln Ile Gln Asn
100 105 110

Val Val Met His Leu Gly Asp Tyr Met Asn Ile His Thr His Ala Cys
115 120 125

Ile Gly Gly Lys Asn Val Gly Glu Asp Val Lys Lys Leu Gln Gln Gly
130 135 140

Gln Gln Ile Val Ser Gly Thr Pro Gly Arg Val Ile Asp Val Ile Lys
145 150 155 160

Arg Arg Asn Leu Gln Thr Arg Asn Ile Lys Val Leu Ile Leu Asp Glu
165 170 175

Ala Asp Glu Leu Phe Thr Lys Gly Phe Lys Glu Gln Ile Tyr Glu Ile
180 185 190

Tyr Lys His Leu Pro Pro Ser Val Gln Val Val Val Val Ser Ala Thr
195 200 205

Leu Pro Arg Glu Val Leu Glu Met Thr Ser Lys Phe Thr Thr Asp Pro
210 215 220

Val Lys Ile Leu Val Lys Arg Asp Glu Ile Ser Leu Ser Gly Ile Lys
225 230 235 240

Gln Tyr Tyr Val Gln Cys Glu Arg Glu Asp Trp Lys Phe Asp Thr Leu
245 250 255

Cys Asp Leu Tyr Asp Asn Leu Thr Ile Thr Gln Ala Val Ile Phe Cys
260 265 270

Asn Thr Lys Leu Lys Val Asn Trp Leu Ala Asp Gln Met Lys Lys Gln
275 280 285

Asn Phe Thr Val Val Ala Met His Gly Asp Met Lys Gln Asp Glu Arg
290 295 300

Asp Ser Ile Met Asn Asp Phe Arg Arg Gly Asn Ser Arg Val Leu Ile
305 310 315 320

Ser Thr Asp Val Trp Ala Arg Gly Ile Asp Val Gln Gln Val Ser Leu
325 330 335

Val Ile Asn Tyr Asp Leu Pro Thr Asp Lys Glu Asn Tyr Ile His Arg
340 345 350

Ile Gly Arg Ser Gly Arg Phe Gly Arg Lys Gly Thr Ala Ile Asn Leu
355 360 365

Ile Thr Lys Asp Asp Val Val Thr Leu Lys Glu Leu Glu Lys Tyr Tyr
370 375 380

Ser Thr Lys Ile Lys Glu Met Pro Met Asn Ile Asn Asp Ile Met
385 390 395 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1020 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

AACGTTGGCC TGGCCCAGTT AATTCCGTTT CCAAGCAAAT GAATGTCGAT ACCGACATCA 60 

TCACGTTGAC CCGTTTTATT TTACAAGAAC AGCAAACTGT TGCTCCCACC GCCACCGGTG 120 

AGTTGTCGTT GTTGTTGAAT GCGCTTCAAT TTGCATTCAA GTTTATTGCC CACAATATCA 180 

GAAGAGCTGA GTTGGTCAAC CTTATTGGTG TTTCTGGCTC TGCCAACTCT ACCGGTGATG 240 

TTCAGAAGAA ATTGGATGTG ATTGGTGATG AGATCTTTAT CAATGCCATG AGATCTTCCA 300 

ACAACGTCAA GGTTTTGGTT TCTGAAGAGC AAGAAGACCT TATTGTGTTC CCAGGTGGTG 360 

GCACATATGC TGTTTGTACT GATCCAATTG ATGGGTCGTC CAATATCGAT GCTGGTGTTT 420 

CTGTTGGTAC GATTTTTGGT GTGTACAAGT TGCAAGAGGG GTCTACTGGT GGCATCAGCG 480

ATGTCTTGCG TCCTGGTAAG GAGATGGTCG CTGCGGGGTA CACCATGTAC GGTGCATCTG 540

CCCATTTGGC ATTGACTACA GGTCACGGTG TCAATCTTTT TACTTTGGAT ACTCAGTTGG 600 

GTGAATTTAT CTTGACCCAT CCAAACTTGA AGTTGCCAGA TACTAAGAAC ATCTACTCGT 660 

TGAATGAAGG GTACTCGAAC AAATTCCCAG AATACGTTCA AGATTATCTG AAGGACATTA 720 

AAAAGGAAGG GTACAGTTTG AGATACATTG GACTGATGGT TGCTGATGTC CATCGTACTC 780

TTTTGTATGG TGGTATTTTT GCTTACCCTA CATTAAAGTT GAGAGTGTTG TATGAATGTT 840 

TCCCCATGGC CTTGTTGATG GAACAAGCAG GCGGTTCTGC TGTCACCATC AAGGGTGAGA 900 

GGATCTTGGA TATCTTGCCA AAAGGTATAC ACGACAAGAG TTCTATTGTG TTGGGATCCA 960 

AGGGTGAAGT TGAAAAGTAT TTAAAGCATG TACCAAAATA GATTATGTAG AAAATTTATG 1020 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 320 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Met Asn Val Asp Thr Asp Ile Ile Thr Leu Thr Arg Phe Ile Leu Gln
1 5 10 15

Glu Gln Gln Thr Val Ala Pro Thr Ala Thr Gly Glu Leu Ser Leu Leu
20 25 30

Leu Asn Ala Leu Gln Phe Ala Phe Lys Phe Ile Ala His Asn Ile Arg
35 40 45

Arg Ala Glu Leu Val Asn Leu Ile Gly Val Ser Gly Ser Ala Asn Ser
50 55 60

Thr Gly Asp Val Gln Lys Lys Leu Asp Val Ile Gly Asp Glu Ile Phe
65 70 75 80

Ile Asn Ala Met Arg Ser Ser Asn Asn Val Lys Val Leu Val Ser Glu
85 90 95

Glu Gln Glu Asp Leu Ile Val Phe Pro Gly Gly Gly Thr Tyr Ala Val
100 105 110

Cys Thr Asp Pro Ile Asp Gly Ser Ser Asn Ile Asp Ala Gly Val Ser
115 120 125

Val Gly Thr Ile Phe Gly Val Tyr Lys Leu Gln Glu Gly Ser Thr Gly
130 135 140

Gly Ile Ser Asp Val Leu Arg Pro Gly Lys Glu Met Val Ala Ala Gly
145 150 155 160

Tyr Thr Met Tyr Gly Ala Ser Ala His Leu Ala Leu Thr Thr Gly His
165 170 175

Gly Val Asn Leu Phe Thr Leu Asp Thr Gln Leu Gly Glu Phe Ile Leu
180 185 190

Thr His Pro Asn Leu Lys Leu Pro Asp Thr Lys Asn Ile Tyr Ser Leu
195 200 205

Asn Glu Gly Tyr Ser Asn Lys Phe Pro Glu Tyr Val Gln Asp Tyr Ser
210 215 220

Lys Asp Ile Lys Lys Glu Gly Tyr Ser Leu Arg Tyr Ile Gly Ser Met
225 230 235 240

Val Ala Asp Val His Arg Thr Leu Leu Tyr Gly Gly Ile Phe Ala Tyr
245 250 255

Pro Thr Leu Lys Leu Arg Val Leu Tyr Glu Cys Phe Pro Met Ala Leu
260 265 270

Leu Met Glu Gln Ala Gly Gly Ser Ala Val Thr Ile Lys Gly Glu Arg
275 280 285

Ile Leu Asp Ile Leu Pro Lys Gly Ile His Asp Lys Ser Ser Ile Val
290 295 300 

Leu Gly Ser Lys Gly Glu Val Glu Lys Tyr Leu Lys His Val Pro Lys 
305 310 315 320 



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 825 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:



AACCCCACCT TCAAAGACAA AGAAGATTTC GTCAAGCAAA CGAATGTCAG AGCAGAAAAG 60

AACCAAGAAC TAATCAAATT TGCCCGTGAC AACCTTAACC ATTTACCATT CACCGAAAAA 120

GACGGAGGTG CATGGGAAAA CTATGAACGA ATGATCAGTG GTATGCTCTA CAACTGTTTA 180

CAAAAAGAAT TGGAAACAAC ACGTATGTCT TGCAGAGACT ACATGTTGGA CTACGGCAGT 240

TTCAGAACTA GAGATTATAA AACAACCCAA GAATTTCTTG ATGCAAAATA CAAACATTTA 300

GAAAGTTTCA TTGGACATGT TGGCAAAAAT GCATTTATGG AATATCCAAT CTATTTTGAT 360

TATGGGTTTA ACACTTATTT GGGTGATAAT TTCTATTCCA ATTACAATTT GACAATTTTG 420

GATGTTTCCA TAGTCAGAAT TGGTAATAAT GTCAAGTGTG GTCCCAATGT ATCTATCCTT 480

ACCCCAACAC ACCCAGTGGA TCCCACTTTG CGCTATGATC AATTGGAAAA TGCCTTGCCT 540

GTGACGGTGG GTAACGGGGT CTGGTTGTGT GGAAGCTGTA CCATTCTTGG TGGGGTGACA 600

GTAGGTGATG GCAGCATTGT GGCTGCTGGT GCAGTTGTCA ACAAGGACGT TCCACCAAAC 660

ACTGTAGTTG CGGGAGTTCC TGCTAGGGTA GTTAAGCAGC TAGAACCTAG AGACCCTAAC 720

TTTGACACTA TGGCAGTTTT GAAACAATAT GGTATGGGTT ATATAGATTA GTAATTAGAT 780

TTGATGTAAT GTACACGACT ACACTATTTG CTGGTGTCTG 825



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 206 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Met Ile Ser Gly Met Leu Tyr Asn Cys Leu Gln Lys Glu Leu Glu Thr
1 5 10 15

Thr Arg Met Ser Cys Arg Asp Tyr Met Leu Asp Tyr Gly Ser Phe Arg
20 25 30

Thr Arg Asp Tyr Lys Thr Thr Gln Glu Phe Leu Asp Ala Lys Tyr Lys
35 40 45

His Leu Glu Ser Phe Ile Gly His Val Gly Lys Asn Ala Phe Met Glu
50 55 60

Tyr Pro Ile Tyr Phe Asp Tyr Gly Phe Asn Thr Tyr Leu Gly Asp Asn
65 70 75 80

Phe Tyr Ser Asn Tyr Asn Leu Thr Ile Leu Asp Val Ser Ile Val Arg
85 90 95

Ile Gly Asn Asn Val Lys Cys Gly Pro Asn Val Ser Ile Leu Thr Pro
100 105 110

Thr His Pro Val Asp Pro Thr Leu Arg Tyr Asp Gln Leu Glu Asn Ala
115 120 125

Leu Pro Val Thr Val Gly Asn Gly Val Trp Leu Cys Gly Ser Cys Thr
130 135 140

Ile Leu Gly Gly Val Thr Val Gly Asp Gly Ser Ile Val Ala Ala Gly
145 150 155 160

Ala Val Val Asn Lys Asp Val Pro Pro Asn Thr Val Val Ala Gly Val
165 170 175

Pro Ala Arg Val Val Lys Gln Leu Glu Pro Arg Asp Pro Asn Phe Asp
180 185 190

Thr Met Ala Val Leu Lys Gln Tyr Gly Met Gly Tyr Ile Asp
195 200 205 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1380 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 



AATTACAATC TGGTTTGTTA CTACCATATC CCATTAGTGT TATTGTCATT GTAGATATTG 60

ATAATGGTTA AAGGATTGGT TTTCATTTTT TGTGTAATGA ATGAGCCAAA ATAAAAAATC 120

AATTCGATGC GATGCAATGA AGTTTAATAA AATTTTTTTT TTTCTTTATT TCTTTTAATC 180

AACCCATCAA TCATTAAATT GAATCAATAC CTACCATTAA CATACTTCTA TATACATATA 240

TATATATAAC AAAATATCAT GGGGAAGATA ACAACTAGTG ATACTAAAAC AAAACAACGT 300

CATAATCCAT TATTAAAAGA TATTTCATCC CAAGGTGGGA ATTTAAGAAC CGTTCCAAGA 360

TCATCATCAT CATCATCATC ACAAAAGAAG AAATCATCAA AGAAACAAAG ACATAACGAT 420

GAAGACGACG AAGAAAATGG TGGCGGTGAA GGATTTTTAG ATGCTTCTAG TTCAAGAAAG 480

ATTTTACAAT TGGCAAAAGA ACAACAAGAT GAACTTGAAC AAGAAGATGA AATACAAAAT 540

AAACCTTCAT TTGCTCAATC ATTTAAAAAT CAACAAATAG ATAGTGAAGA AGAAGAAGAG 600

GAAGATGAGT ATTCAGATTT TGAAGAAGAA GAAGAAGTTG AAGAGATAGT ATATGATGAA 660

GAAGATGCAG AAGTTGATCC CAAAGATGCA GAATTATTTA ATAAATATTT CCAATCCAAC 720

GGTGAAGCTA ATAATAATGA TGATGATAAT TCATTTCAAC CAACAATAAA TTTAGCTGAT 780

AAAATCTTAG CCAAAATTC^^GAAAAAGAA TCCCAACAAC AACAACAACA ACAAAGCTCT 840 

CCAGATAATA GTAATGAAGA TGCCGTATTG TTACCACCAA AAGTCATTTT AGCTTATGAA 900 

AAAATTGGTC AAATTTTATC AACTTATACT CATGGGAAAT TACCTAAATT ATTTAAAATT 960 

TTACCAAGTT TAAAAAATTG GCAAGATGTA TTATACGTGA CAAATCCAAA TAGTTGGACT 1020 

CCTCATGCCA CATATGAAGC AACTAAATTA TTTGTGTCGA ATTTATCAAG TAATGAAGCT 1080 

ACAGTTTTCA TTGAAACTAT CTTGTTGCCA CGATTCCGTG ATTCTATTGA T^AATTCCGAT 1140 

GATCATTCAT TAAATTATCA TATTTATCGA GCATTAAAAA AATCATTATA TAAACCAGGA 1200 

GCTTTTTTCA AAGGGTTCTT GTTACCTTTA GTCGATGGTT ATTGTTCTGT ACGTGAAGCC 1260

ACTATTGCTG CTTCAGTGTT AACTAAAGTT TCTGTCCCTG TTTTACATTC ATGTCATTAT 1320 

TGTGGCGTAC TGATGAATAA AAAACGAGAA TCACCTGTAT TTGTCCTACG GCGAATATAA 1380 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Met Gly Lys Ile Thr Thr Ser Asp Thr Lys Thr Lys Gln Arg His Asn
1 5 10 15

Pro Leu Leu Lys Asp Ile Ser Ser Gln Gly Gly Asn Leu Arg Thr Val
20 25 30

Pro Arg Ser Ser Ser Ser Ser Ser Ser Gln Lys Lys Lys Ser Ser Lys
35 40 45

Lys Gln Arg His Asn Asp Glu Asp Asp Glu Glu Asn Gly Gly Gly Glu
50 55 60

Gly Phe Leu Asp Ala Ser Ser Ser Arg Lys Ile Leu Gln Leu Ala Lys
65 70 75 80

Glu Gln Gln Asp Glu Leu Glu Gln Glu Asp Glu Ile Gln Asn Lys Pro
85 90 95

Ser Phe Ala Gln Ser Phe Lys Asn Gln Gln Ile Asp Ser Glu Glu Glu
100 105 110

Glu Glu Glu Asp Glu Tyr Ser Asp Phe Glu Glu Glu Glu Glu Val Glu
115 120 125

Glu Ile Val Tyr Asp Glu Glu Asp Ala Glu Val Asp Pro Lys Asp Ala
130 135 140

Glu Leu Phe Asn Lys Tyr Phe Gln Ser Asn Gly Glu Ala Asn Asn Asn
145 150 155 160

Asp Asp Asp Asn Ser Phe Gln Pro Thr Ile Asn Leu Ala Asp Lys Ile
165 170 175

Leu Ala Lys Ile Gln Glu Lys Glu Ser Gln Gln Gln Gln Gln Gln Gln
180 185 190

Ser Ser Pro Asp Asn Ser Asn Glu Asp Ala Val Leu Leu Pro Pro Lys
195 200 205

Val Ile Leu Ala Tyr Glu Lys Ile Gly Gln Ile Leu Ser Thr Tyr Thr
210 215 220

His Gly Lys Leu Pro Lys Leu Phe Lys Ile Leu Pro Ser Leu Lys Asn
225 230 235 240

Trp Gln Asp Val Leu Tyr Val Thr Asn Pro Asn Ser Trp Thr Pro His
245 250 255

Ala Thr Tyr Glu Ala Thr Lys Leu Phe Val Ser Asn Leu Ser Ser Asn
260 265 270

Glu Ala Thr Val Phe Ile Glu Thr Ile Leu Leu Pro Arg Phe Arg Asp
275 280 285

Ser Ile Glu Asn Ser Asp Asp His Ser Leu Asn Tyr His Ile Tyr Arg
290 295 300

Ala Leu Lys Lys Ser Leu Tyr Lys Pro Gly Ala Phe Phe Lys Gly Phe
305 310 315 320

Leu Leu Pro Leu Val Asp Gly Tyr Cys Ser Val Arg Glu Ala Thr Ile
325 330 335 

Ala Ala Ser Val Leu Thr Lys Val Ser Val Pro Val Leu His Ser Cys 
340 345 350 

His Tyr Cys Gly Val Ser Met Asn Lys Lys Arg Glu Ser Pro Val Phe 
355 360 365 

Val Leu Arg Arg He 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 823 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

AACCAACAAT GAGTCAAGTC GCTCCAAAGT GGTACCAATC AGAAGACGTT CCAGCTCCAA 60 

AACAAACCAG AAAGACTGCT CGTCCACAAA AATTACGTGC CTCTTTAGTC CCAGGTACCG 120 

TTTTAATTTT ATTGGCCGGT AGATTCAGAG GTAAAAGAGT TGTTTACTTG AAGAACTTGG 180 

AAGACAACAC CTTATTGGTT TCTGGTCCAT TCAAAGTCAA TGGTGTTCCA TTGAGAAGAG 240

TTAACGCTAG ATACGTTATC GCCACCTCCA CCAAAGTCAA CGTTTCTGGT GTTGATGTTT 300 

CTAAATTCAA CGTCGAATAC TTTGCTAGAG AAAAATCTTC TAAATCTAAA AAATCCGAAG 360

CTGAATTCTT CAATGAATCT CAACCAAAGA AAGAAATCAA AGCTGAAAGA GTTGCTGACC 420 

AAAAATCTGT CGATGCTGCT TTATTAAGTG AAATCAAAAA GACCCCATTA TTGAAACAAT 480 

ACTTGGCCGC TTCATTCTCT TTGAAGAACG GTGACAGACC ACACTTGTTA AAATTTTAAT 540 



TTAGGTGAAA TTAATATTTT GCAAACATGT TCATGATAAA TAACAATGTG GCTTTTAAAG 600 

CAATGGATGG GATATGGTTA AGAGGATGTC TTTATATTTT GAGTTTTATA TATGGGTACT 660 

TTGTTTAATA ATGGAAGGTA TTGGCTCAGA TGAACTTCAA AATGGAGATT ACTTTTTTCT 720 

TTTACTTTTA CAATATTTTC GTCTATTTGC TGTTTAAGCT GCAAAAACAA ATTTTTAATC 780 

GGTGTATCTT AACTCTTATT CATTTTGTAT ATTTAATACA TAT 823 
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 176 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Met Ser Gln Val Ala Pro Lys Trp Tyr Gln Ser Glu Asp Val Pro Ala
1 5 10 15

Pro Lys Gln Thr Arg Lys Thr Ala Arg Pro Gln Lys Leu Arg Ala Ser
20 25 30

Leu Val Pro Gly Thr Val Leu Ile Leu Leu Ala Gly Arg Phe Arg Gly
35 40 45

Lys Arg Val Val Tyr Leu Lys Asn Leu Glu Asp Asn Thr Leu Leu Val
50 55 60

Ser Gly Pro Phe Lys Val Asn Gly Val Pro Leu Arg Arg Val Asn Ala
65 70 75 80

Arg Tyr Val Ile Ala Thr Ser Thr Lys Val Asn Val Ser Gly Val Asp
85 90 95

Val Ser Lys Phe Asn Val Glu Tyr Phe Ala Arg Glu Lys Ser Ser Lys
100 105 110

Ser Lys Lys Ser Glu Ala Glu Phe Phe Asn Glu Ser Gln Pro Lys Lys
115 120 125

Glu Ile Lys Ala Glu Arg Val Ala Asp Gln Lys Ser Val Asp Ala Ala
130 135 140

Leu Leu Ser Glu Ile Lys Lys Thr Pro Leu Leu Lys Gln Tyr Leu Ala
145 150 155 160

Ala Ser Phe Ser Leu Lys Asn Gly Asp Arg Pro His Leu Leu Lys Phe
165 170 175



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

AACATTAAAG CAAGATGGAA AACGATAAAG GTCAATTAGT TGAATTATAC GTCCCAAGAA 60

AATGTTCTGC TACCAACAGA ATCATTAAAG CCAAAGATCA CGCTTCTGTT CAAATCTCAA 120

TTGCTAAAGT TGATGAAGAC GGTAGAGCTA TTGCTGGTGA AAACATCACT TACGCTTTAA 180

GTGGTTACGT TAGAGGTAGA GGTGAAGCTG ATGACTCATT AAACAGATTG GCTCAACAAG 240

ACGGTTTATT GAAGAACGTC TGGTCTTACT CTCGTTAAGA GAATAGAAGA ATAGACAAAA 300

TTGATAATTG GGTATTTTAA GAAATTACTT TTTTTATATT GCAAATTAAT TTTAATCTTT 360

CTTCTGTGTA TATTTAATGT CTTAACATAA TAAAAAAAAA GAATAGAAAT GGTTT 415
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 87 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Met Glu Asn Asp Lys Gly Gln Leu Val Glu Leu Tyr Val Pro Arg Lys
1 5 10 15

Cys Ser Ala Thr Asn Arg Ile Ile Lys Ala Lys Asp His Ala Ser Val
20 25 30

Gln Ile Ser Ile Ala Lys Val Asp Glu Asp Gly Arg Ala Ile Ala Gly
35 40 45

Glu Asn Ile Thr Tyr Ala Leu Ser Gly Tyr Val Arg Gly Arg Gly Glu
50 55 60

Ala Asp Asp Ser Leu Asn Arg Leu Ala Gln Gln Asp Gly Leu Leu Lys
65 70 75 80

Asn Val Trp Ser Tyr Ser Arg
85
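
The correspondence between a cDNA listing and its peptide listing can be checked by translating the open reading frame. The following is a minimal Python sketch and is not part of the original disclosure; the function name translate and the way the codon table is built are illustrative assumptions. It assumes the alternative yeast nuclear code used by Candida albicans, in which the codon CTG is read as serine rather than leucine, and it takes the first 60 nucleotides of SEQ ID NO: 63 as input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Assumption: Candida albicans decodes CTG (CUG) as serine, not leucine.
CODON_TABLE["CTG"] = "S"


def translate(dna, start=0):
    """Translate dna from index start until a stop codon or the end.

    Codons containing an ambiguous base such as N are reported as X (Xaa).
    A trailing incomplete codon is ignored.
    """
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row (nucleotides 1 to 60) of SEQ ID NO: 63.
row = "AACATTAAAG CAAGATGGAA AACGATAAAG GTCAATTAGT TGAATTATAC GTCCCAAGAA"
seq = row.replace(" ", "")
# Start translating at the first ATG in the row.
peptide = translate(seq, seq.find("ATG"))
print(peptide)
```

The printed one-letter peptide can be compared with residues 1 to 15 of SEQ ID NO: 64 (Met Glu Asn Asp Lys Gly Gln Leu Val Glu Leu Tyr Val Pro Arg). The same function can be applied to the other cDNA listings, provided the start position of the reading frame is supplied.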

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1519 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 749 




(D) OTHER INFORMATION: /note= "N = A or T or C or G"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 



ACCATGTGTC AAATTGCTTG GTCGTGTCCT TTCACCACAC ATTTTTTTGG ATTAAATTTC 60

TCGCACGCTC AAAAAATGAC TTCGACAAAA AGCAATGCCA CTCTTCCTAC AATTAATTCC 120

CTCCGCCCCT TCCTTTTCAT ATACTATCTC CCTTCCTTCT TCCTTCTCCT TTTATTTTTT 180

CAATTATTAC AATCTTATGT CATTTAAAGG ATTCAAAAAG GGTGTCCTTA GGGCCCCACA 240

GACAATGCGT CAGAAATTCA ACATGGGAGA AATCACCCAA GATGCTGTTT ATCTCGATGC 300

TGAAAGAAGA TTCAAAGAAA TCGAAACGGA AACAAAAAAG TTGAGTGAAG AATCCAAGAA 360

ATATTTCAAT GCTGTCAATG GGATGTTAGA TGAACAAATT GATTTTGCCA AAGCCGTGGC 420

TGAGATTTAT AAACCAATCA GTGGTAGATT ATCGGACCCC AGTGCTACGG TACCAGAAGA 480

TAACCCAC7UV GGTATTGAAG CATCGGAACT GTACCAAGCA GTGGTTAAAG ATCTCAAAGA 540

TACCTTAAAA CCCGATTTGG AATTGATTGA AAAAAGAATT GTTGAACCAG CACAAGAATT 600

ATTGAAGATT ATACAAGCTA TAAGGAAAAT GTCAGTGAAA AGAGACCATA AACAATTGGA 660

TTTGGATCGT CATAAGAGAA ATTTTTCTAA ATATGAACTG AAGAAAGAAA GAACTGTTAA 720

AGATGAAGAA AAAATGTTCA GTGCTCAANC AGAAGTAGAA ATTGCTCAAC AAGAGTACGA 780

TTATTATAAT GATTTGTTAA AGAATGAATT GCCAGTTTTG TTTCAAATGC AAAGTGATTT 840

TATCAAACCA TTGTTTGTTT CATTCTATTA CATGCAGTTG AATATTTTCT ACACATTATA 900

CACTAGAATG GAAGAGTTGA AAATTCCATA TTTTGATTTG TCTACTGATA TTGTCGAAGC 960

TTATACTGCC AAGAAGGGGA ACATTGAGGA ACAAACCGAT GCTATTGGAA TCACTCATTT 1020

CAAAGTCGGG CATGCCAAAT CCAAATTGGA AGCCACTAAA AGAAGACATG CTGCTATGAA 1080

TAGTCCACCT CCTACCGGTG CCAGCTCTAT TGCATCTACA GGTACTGGTG GTGAATTACC 1140

TGCATACTCC CCAGGAGGTT ACAACCAACC ATATGGTGAT AGCAAGTATC AACCACCATC 1200

TTCTCCAGCA ACATACCAAT CTCCAGTAGT AGCAGCCACT GCTCAATCTC CAGCTACTTA 1260

TCAATCGCCA GTGGCTACTG GACAACCTCC ATCATATTTA CCACAAACTC CAGCCAGTGC 1320

TCCACCACCA CAAGTTGGTA GTGGCCTTCC AACATGCACG GCTTTATACG ATTATACTGC 1380

ACAAGCCCAG GGTGACTTGA CTTTCCCTGC AGGAGCTGTT ATTGAAATTA TACAAAGAAC 1440

CGAAGATGCC AACGGATGGT GGACTGGTAA ATACAATGGT CAAACCGGTG TGTTCCCTGG 1500

TAATTATGTG CAATTATAG 1519



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 440 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Met Ser Phe Lys Gly Phe Lys Lys Gly Val Leu Arg Ala Pro Gln Thr
1 5 10 15

Met Arg Gln Lys Phe Asn Met Gly Glu Ile Thr Gln Asp Ala Val Tyr
20 25 30

Leu Asp Ala Glu Arg Arg Phe Lys Glu Ile Glu Thr Glu Thr Lys Lys
35 40 45

Leu Ser Glu Glu Ser Lys Lys Tyr Phe Asn Ala Val Asn Gly Met Leu
50 55 60

Asp Glu Gln Ile Asp Phe Ala Lys Ala Val Ala Glu Ile Tyr Lys Pro
65 70 75 80

Ile Ser Gly Arg Leu Ser Asp Pro Ser Ala Thr Val Pro Glu Asp Asn
85 90 95

Pro Gln Gly Ile Glu Ala Ser Glu Ser Tyr Gln Ala Val Val Lys Asp
100 105 110

Leu Lys Asp Thr Leu Lys Pro Asp Leu Glu Leu Ile Glu Lys Arg Ile
115 120 125

Val Glu Pro Ala Gln Glu Leu Leu Lys Ile Ile Gln Ala Ile Arg Lys
130 135 140

Met Ser Val Lys Arg Asp His Lys Gln Leu Asp Leu Asp Arg His Lys
145 150 155 160

Arg Asn Phe Ser Lys Tyr Glu Ser Lys Lys Glu Arg Thr Val Lys Asp
165 170 175

Glu Glu Lys Met Phe Ser Ala Gln Xaa Glu Val Glu Ile Ala Gln Gln
180 185 190

Glu Tyr Asp Tyr Tyr Asn Asp Leu Leu Lys Asn Glu Leu Pro Val Leu
195 200 205

Phe Gln Met Gln Ser Asp Phe Ile Lys Pro Leu Phe Val Ser Phe Tyr
210 215 220

Tyr Met Gln Leu Asn Ile Phe Tyr Thr Leu Tyr Thr Arg Met Glu Glu
225 230 235 240

Leu Lys Ile Pro Tyr Phe Asp Leu Ser Thr Asp Ile Val Glu Ala Tyr
245 250 255

Thr Ala Lys Lys Gly Asn Ile Glu Glu Gln Thr Asp Ala Ile Gly Ile
260 265 270

Thr His Phe Lys Val Gly His Ala Lys Ser Lys Leu Glu Ala Thr Lys
275 280 285

Arg Arg His Ala Ala Met Asn Ser Pro Pro Pro Thr Gly Ala Ser Ser
290 295 300

Ile Ala Ser Thr Gly Thr Gly Gly Glu Leu Pro Ala Tyr Ser Pro Gly
305 310 315 320

Gly Tyr Asn Gln Pro Tyr Gly Asp Ser Lys Tyr Gln Pro Pro Ser Ser
325 330 335

Pro Ala Thr Tyr Gln Ser Pro Val Val Ala Ala Thr Ala Gln Ser Pro
340 345 350

Ala Thr Tyr Gln Ser Pro Val Ala Thr Gly Gln Pro Pro Ser Tyr Leu
355 360 365

Pro Gln Thr Pro Ala Ser Ala Pro Pro Pro Gln Val Gly Ser Gly Leu
370 375 380

Pro Thr Cys Thr Ala Leu Tyr Asp Tyr Thr Ala Gln Ala Gln Gly Asp
385 390 395 400

Leu Thr Phe Pro Ala Gly Ala Val Ile Glu Ile Ile Gln Arg Thr Glu
405 410 415

Asp Ala Asn Gly Trp Trp Thr Gly Lys Tyr Asn Gly Gln Thr Gly Val
420 425 430

Phe Pro Gly Asn Tyr Val Gln Leu
435 440

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 855 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

ATAATTTTCA GAAAGAGACT AGATTCTGAT AGAAATATAG ACGCATCACT ATATTTTGGA 60 

AATATAGATC CACAAGTTAC GGAGTTGTTA ATGTATGAGT TGTTCATCCA ATTTGGTCCC 120 

GTCAAATCAA TCAATATGCC AAAGGATCGT ATATTGAAAA CACACCAGGG GTATGGATTT 180 

GTCGAATTTA AAAACTCAGC AGATGCCAAA TATACTATGG AAATACTACG AGGAATAAGA 240

CTTTATGGAA AAGCATTGAA ATTGAAACGA ATTGATGCCA AGTCTCAGTC ATCAACAAAC 300

AACCCAAATA ATCAAACAAT AGGAACATTT GTACAATCAG ATTTGATCAA TCCAAATTAC 360 

ATAGATGTTG GAGCTAAACT ATTTATCAAC AATCTTAATC CATTGGTCGA TGAATCCTTT 420 

TTAATGGATA CGTTTAGTAA GTTTGGAACC CTTATAAGAA ACCCAATAAT TAGACGTGAT 480

TCAGAGGGAC ACTCTTTGGG ATACGGATTT CTTACGTACG ATGACTTTGA AAGTAGTGAT 540

TTATGCATAC AAAAAATGAA CAACACGATT TTGATGAATA ACAAAATTGC TATCAGTTAT 600 

GCATTCAAGG ATCTGAGTGT TGATGGGAAG AAATCCCGGC ATGGAGATCA AGTGGAGCGG 660 

AAATTGGCTG AAAGTGCCAA AAAGAATAAT TTGTTGGTAA CGAAAACTTC TAAGGCAGGT 720 

ACGACGAAGG GAAATAAAAG GAAGAATAAA CCACATAAAG TGACCAAACC GTGAGACAAT 780 

GAGTTAGCTC CCCCTTTCAA AATAAGTAGA GTATCACCAT AGTTTATGAA ACAATTGATA 840 

TATTAAGCTT CTCTG 855
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 257 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Ile Ile Phe Arg Lys Arg Leu Asp Ser Asp Arg Asn Ile Asp Ala Ser
1 5 10 15

Leu Tyr Phe Gly Asn Ile Asp Pro Gln Val Thr Glu Leu Leu Met Tyr
20 25 30

Glu Leu Phe Ile Gln Phe Gly Pro Val Lys Ser Ile Asn Met Pro Lys
35 40 45

Asp Arg Ile Leu Lys Thr His Gln Gly Tyr Gly Phe Val Glu Phe Lys
50 55 60

Asn Ser Ala Asp Ala Lys Tyr Thr Met Glu Ile Leu Arg Gly Ile Arg
65 70 75 80

Leu Tyr Gly Lys Ala Leu Lys Leu Lys Arg Ile Asp Ala Lys Ser Gln
85 90 95

Ser Ser Thr Asn Asn Pro Asn Asn Gln Thr Ile Gly Thr Phe Val Gln
100 105 110

Ser Asp Leu Ile Asn Pro Asn Tyr Ile Asp Val Gly Ala Lys Leu Phe
115 120 125

Ile Asn Asn Leu Asn Pro Leu Val Asp Glu Ser Phe Leu Met Asp Thr
130 135 140

Phe Ser Lys Phe Gly Thr Leu Ile Arg Asn Pro Ile Ile Arg Arg Asp
145 150 155 160

Ser Glu Gly His Ser Leu Gly Tyr Gly Phe Leu Thr Tyr Asp Asp Phe
165 170 175

Glu Ser Ser Asp Leu Cys Ile Gln Lys Met Asn Asn Thr Ile Leu Met
180 185 190

Asn Asn Lys Ile Ala Ile Ser Tyr Ala Phe Lys Asp Ser Ser Val Asp
195 200 205

Gly Lys Lys Ser Arg His Gly Asp Gln Val Glu Arg Lys Leu Ala Glu
210 215 220

Ser Ala Lys Lys Asn Asn Leu Leu Val Thr Lys Thr Ser Lys Ala Gly
225 230 235 240

Thr Thr Lys Gly Asn Lys Arg Lys Asn Lys Pro His Lys Val Thr Lys
245 250 255

Pro

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1685 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

CTGTTTATTA AATGGATATA TGTTAAACCA TGAACTTCGG TTTATCAGAA AAATTGGTGC 60 

TGGTACCTAT GGTTTGATTT ACCTTGTGGA AAATATCTAC ACTAAACAAC AATTTGCTGC 120 

TAAAATGGTT CTTGAACAGC CATTACTCAA ACAAAAGCAA CAACAACAAC AAAGTCATCA 180 

TGGACATAAA GGAGAATCTA GTATGAACAA ACAAATAATA CTGCAAGAAT TTTATCAATA 240

TTTTTTAAAC AATAGTATGC CACAACCACG AAATTTGGAC TTGAATTACC TTCGAGACAA 300 

CGGACATGAT TGCCCCTTTT TGACTGAAAT CTCATTACAT TTAAAAGTAC ATCAACACCC 360 

AAACATAGCG ACTATTCATC AAGTATTAAA CATTGAAGAT TTTGCCATAA TAATATTGAT 420 

GGATCATTTT GAGCAAGGAG ATTTGTTCAC TAATATCATT GATAGACAAA TATTCACCAA 480

TAATAGTCAT AGAAAAGTTC CAAGAACAGA TTTTGAAACC CAATTATTAA TGAAGAATGC 540

CATGTTACAA TTGATAGAAG CCATTGAATA TTGTCACGAA AATAATATTT ACCATTGTGA 600 

TTTAAAACCA GAAAACATTA TGGTTAGATA TAATCCATAC TATGTTCGTC CAACTATCAA 660 

TAACAATAAT AACAATGGAG AAGATGATTT ATGCTATGCC AACAGTATTA TTGACTATAA 720 

TGAATTACAC CTCGTGTTGA TTGATTTTGG TTTAGCTATG GACTCTGCTA CCATTTGTTG 780 

TAATTCATGT CGTGGATCGT CATTTTACAT GGCACCAGAA AGAACCACCA ATTATAACAC 840

CCATCGTTTA ATCAACCAAT TAATTGATAT GAATCAATAT GAGTCAATTG AAATCAATGG 900 

GACAACAGTG ACAAAATCAA ACTGTAAATA TTTACCTACA TTGGCTGGGG ATATTTGGTC 960 

ATTGGGAGTA TTGTTCATTA ATATCACTTG TTCAAGAAAC CCATGGCCCA TTGCATCATT 1020 

TGATAATAAT CAAAATAATG AAGTGTTTAA GAATTATATG TTGAATAATA ACAAGGCTGT 1080

TTTGAGCAAA ATCTTACCCA TTTCCTCACA ATTTAATCGC TTATTAGATA GAATTTTCAA 1140

ATTGAATCCT AATGATAGAA TAGATTTACC AACTTTATAC AAAGAAGTTA TTCGTTGTGA 1200

TTTCTTCAAA GATGATCATT ACTACTATGC CCAACATCAA CATCATCACA ATCACAATCA 1260

AATCAATAAT GCTTACAATC ACTATCAGAA ACAACCTAAT CAAGCAAGAC CTACTGCAAA 1320

CCAACAATTG TATACACCAC CGGAAACCAC CACTTATAAT TCATACGCTA GTGATATGGA 1380

AGAAGATGAA ATTAGTGATG ATGAGTTTTA TTCTGATGAA GAAGATGAAG ATATTGAAGA 1440

CTATGAAGAG GAAGAGGAAG AGTATTTTGG TAATGAGCAA CAACAACAAC AGCAAGTCAC 1500 

AACAGTGAAT GGTAATTTTG GTCAAGTTAA AGGTACCTGT TATTACGATA CCAAAACCAA 1560

AACAACTACA TATATAAAAC CACCAGCTGC ATATACTTTA GAGACGCCTA GTCAAAGTGT 1620 

TGAATACTGT TAAGTTGTAC ACATAAATAA TTAATGACAA TTAATAATAA CGATTAATAA 1680

TATAG 1685 
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 537 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Met Leu Asn His Glu Leu Arg Phe Ile Arg Lys Ile Gly Ala Gly Thr
1 5 10 15

Tyr Gly Leu Ile Tyr Leu Val Glu Asn Ile Tyr Thr Lys Gln Gln Phe
20 25 30

Ala Ala Lys Met Val Leu Glu Gln Pro Leu Leu Lys Gln Lys Gln Gln
35 40 45

Gln Gln Gln Ser His His Gly His Lys Gly Glu Ser Ser Met Asn Lys
50 55 60

Gln Ile Ile Ser Gln Glu Phe Tyr Gln Tyr Phe Leu Asn Asn Ser Met
65 70 75 80

Pro Gln Pro Arg Asn Leu Asp Leu Asn Tyr Leu Arg Asp Asn Gly His
85 90 95

Asp Cys Pro Phe Leu Thr Glu Ile Ser Leu His Leu Lys Val His Gln
100 105 110

His Pro Asn Ile Ala Thr Ile His Gln Val Leu Asn Ile Glu Asp Phe
115 120 125

Ala Ile Ile Ile Leu Met Asp His Phe Glu Gln Gly Asp Leu Phe Thr
130 135 140

Asn Ile Ile Asp Arg Gln Ile Phe Thr Asn Asn Ser His Arg Lys Val
145 150 155 160

Pro Arg Thr Asp Phe Glu Thr Gln Leu Leu Met Lys Asn Ala Met Leu
165 170 175

Gln Leu Ile Glu Ala Ile Glu Tyr Cys His Glu Asn Asn Ile Tyr His
180 185 190

Cys Asp Leu Lys Pro Glu Asn Ile Met Val Arg Tyr Asn Pro Tyr Tyr
195 200 205

Val Arg Pro Thr Ile Asn Asn Asn Asn Asn Asn Gly Glu Asp Asp Leu
210 215 220

Cys Tyr Ala Asn Ser Ile Ile Asp Tyr Asn Glu Leu His Leu Val Leu
225 230 235 240

Ile Asp Phe Gly Leu Ala Met Asp Ser Ala Thr Ile Cys Cys Asn Ser
245 250 255

Cys Arg Gly Ser Ser Phe Tyr Met Ala Pro Glu Arg Thr Thr Asn Tyr
260 265 270

Asn Thr His Arg Leu Ile Asn Gln Leu Ile Asp Met Asn Gln Tyr Glu
275 280 285

Ser Ile Glu Ile Asn Gly Thr Thr Val Thr Lys Ser Asn Cys Lys Tyr
290 295 300

Leu Pro Thr Leu Ala Gly Asp Ile Trp Ser Leu Gly Val Leu Phe Ile
305 310 315 320

Asn Ile Thr Cys Ser Arg Asn Pro Trp Pro Ile Ala Ser Phe Asp Asn
325 330 335

Asn Gln Asn Asn Glu Val Phe Lys Asn Tyr Met Leu Asn Asn Asn Lys
340 345 350

Ala Val Leu Ser Lys Ile Leu Pro Ile Ser Ser Gln Phe Asn Arg Leu
355 360 365

Leu Asp Arg Ile Phe Lys Leu Asn Pro Asn Asp Arg Ile Asp Leu Pro
370 375 380

Thr Leu Tyr Lys Glu Val Ile Arg Cys Asp Phe Phe Lys Asp Asp His
385 390 395 400

Tyr Tyr Tyr Ala Gln His Gln His His His Asn His Asn Gln Ile Asn
405 410 415

Asn Ala Tyr Asn His Tyr Gln Lys Gln Pro Asn Gln Ala Arg Pro Thr
420 425 430

Ala Asn Gln Gln Leu Tyr Thr Pro Pro Glu Thr Thr Thr Tyr Asn Ser
435 440 445

Tyr Ala Ser Asp Met Glu Glu Asp Glu Ile Ser Asp Asp Glu Phe Tyr
450 455 460

Ser Asp Glu Glu Asp Glu Asp Ile Glu Asp Tyr Glu Glu Glu Glu Glu
465 470 475 480

Glu Tyr Phe Gly Asn Glu Gln Gln Gln Gln Gln Gln Val Thr Thr Val
485 490 495

Asn Gly Asn Phe Gly Gln Val Lys Gly Thr Cys Tyr Tyr Asp Thr Lys
500 505 510

Thr Lys Thr Thr Thr Tyr Ile Lys Pro Pro Ala Ala Tyr Thr Leu Glu
515 520 525

Thr Pro Ser Gln Ser Val Glu Tyr Cys
530 535

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 848 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
AACCAATTTT AGAAACAATG GCTCGTCAAT TTTTCGTAGG TGGTAACTTC AAAGCTAACG 60

GTACCAAACA ACAAATCACT TCAATCATCG ACAACTTGAA CAAGGCTGAT TTACCAAAGG 120

ATGTCGAAGT TGTCATTTGT CCACCCGCCC TTTACCTTGG TTTAGCTGTA GAGCAAAACA 180

AACAACCAAC TGTTGCCATT GGTGCTCAAA ATGTTTTTGA CAAGTCATGT GGTGCTTTCA 240

CTGGTGAAAC CTGTGCTTCT CAAATCTTGG ATGTTGGTGC CAGCTGGACT TTAACTGGTC 300

ACAGTGAAAG AAGAACCATT ATCAAAGAAT CCGATGAATT CATTGCTGAA AAAACCAAGT 360

TTGCCTTGGA CACTGGTGTC AAAGTTATTT TATGTATTGG TGAAACCTTA GAGGAAAGAA 420

AAGGTGGTGT CACTTTGGAT GTTTGTGCCA GACAATTGGA TGCTGTTTCC AAGATTGTTT 480

CTGATTGGTC AAACATTGTT GTTGCTTACG AACCTGTTTG GGCAATTGGT ACTGGTTTAG 540

CCGCTACCCC AGAAGATGCT GAAGAAACCC ACAAAGGTAT TAGAGCTCAT TTGGCCAAGA 600

CCATTGGTGC CGAACAAGCT GAAAAAACCA GAATCTTGTA CGGTGGTTCA GTTAACGGTA 660

AGAACGCTAA GGATTTCAAA GACAAAGCAA ATGTTGATGG TTTCTTAGTC GGTGGTGCTT 720

CATTAAAACC AGAATTTGTT GATATCATCA AATCTAGATT ATAAACAGTA TATTAAAAAC 780

TATATGCCTA TAGAATTTAG CATGTTGTTG TGAATTTGTA ATGAATCTAT AAAAATGTGC 840

TCATGAAC 848



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 248 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Met Ala Arg Gln Phe Phe Val Gly Gly Asn Phe Lys Ala Asn Gly Thr
1 5 10 15

Lys Gln Gln Ile Thr Ser Ile Ile Asp Asn Leu Asn Lys Ala Asp Leu
20 25 30

Pro Lys Asp Val Glu Val Val Ile Cys Pro Pro Ala Leu Tyr Leu Gly
35 40 45

Leu Ala Val Glu Gln Asn Lys Gln Pro Thr Val Ala Ile Gly Ala Gln
50 55 60

Asn Val Phe Asp Lys Ser Cys Gly Ala Phe Thr Gly Glu Thr Cys Ala
65 70 75 80

Ser Gln Ile Leu Asp Val Gly Ala Ser Trp Thr Leu Thr Gly His Ser
85 90 95

Glu Arg Arg Thr Ile Ile Lys Glu Ser Asp Glu Phe Ile Ala Glu Lys
100 105 110

Thr Lys Phe Ala Leu Asp Thr Gly Val Lys Val Ile Leu Cys Ile Gly
115 120 125

Glu Thr Leu Glu Glu Arg Lys Gly Gly Val Thr Leu Asp Val Cys Ala
130 135 140

Arg Gln Leu Asp Ala Val Ser Lys Ile Val Ser Asp Trp Ser Asn Ile
145 150 155 160

Val Val Ala Tyr Glu Pro Val Trp Ala Ile Gly Thr Gly Leu Ala Ala
165 170 175

Thr Pro Glu Asp Ala Glu Glu Thr His Lys Gly Ile Arg Ala His Leu
180 185 190

Ala Lys Thr Ile Gly Ala Glu Gln Ala Glu Lys Thr Arg Ile Leu Tyr
195 200 205

Gly Gly Ser Val Asn Gly Lys Asn Ala Lys Asp Phe Lys Asp Lys Ala
210 215 220

Asn Val Asp Gly Phe Leu Val Gly Gly Ala Ser Leu Lys Pro Glu Phe
225 230 235 240

Val Asp Ile Ile Lys Ser Arg Leu
245



Claims

1. A nucleic acid molecule encoding a
polypeptide which is critical for survival and growth
of the yeast Candida albicans and which nucleic acid
molecule comprises any of the sequences of nucleotides
in Sequence ID Numbers 1 to 3, 5, 6, 8 to 11, 13, 15,
16, 18, 20, 21, 23, 25 to 29, 31, 35, 37, 39, 41, 43,
45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 and
71.

2. A nucleic acid molecule encoding a
polypeptide which is critical for survival and growth
of the yeast Candida albicans and which nucleic acid
molecule comprises any of the sequences of nucleotides
in Sequence ID Numbers 28, 35, 37 and 39 and fragments
or derivatives of said nucleic acid molecules.

3. A nucleic acid molecule encoding a
polypeptide which is critical for survival and growth
of the yeast Candida albicans and which polypeptide
has an amino acid sequence according to the sequence
of any of Sequence ID Numbers 4, 7, 12, 14, 17, 19,
22, 24, 30, 32 to 34, 36, 38, 40, 42, 44, 46, 48, 50,
52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72.

4. A nucleic acid molecule according to any of
claims 1 to 3 which is mRNA.

5. A nucleic acid molecule according to any of
claims 1 to 3 which is DNA.

6. A nucleic acid molecule according to claim 5
which is cDNA.

7. A nucleic acid molecule capable of
hybridising to the molecules according to any of
claims 1 to 5 under high stringency conditions.

8. A polypeptide having the amino acid
sequences of any of Sequence ID Numbers 4, 7, 12, 14,
17, 19, 22, 24, 30, 32 to 34, 36, 38, 40, 42, 44, 46,
48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70 and 72.

9. A polypeptide encoded by the nucleic acid
molecule according to any of claims 1 to 6.

10. A polypeptide according to claim 9 having the
amino acid sequence of any of Sequence ID Numbers 4,
7, 12, 14, 17, 19, 22, 24, 30, 32 to 34, 36, 38, 40,
42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66,
68, 70 and 72.

11. An expression vector comprising a nucleic
acid molecule according to claim 5 or 6.

12. An expression vector according to claim 11
which comprises an inducible promoter.

13. An expression vector according to claim 11
or 12 which comprises a sequence encoding a reporter
molecule.

14. A nucleic acid molecule according to any of
claims 1 to 7 for use as a medicament.

15. Use of a nucleic acid molecule according to
any of claims 1 to 7 in the preparation of a
medicament for treating Candida albicans associated
diseases.

16. A polypeptide according to any of claims 8
or 10 for use as a medicament.

17. Use of a polypeptide according to any of
claims 8 to 10 in the preparation of a medicament for
treating Candida albicans associated infections.

18. A pharmaceutical composition comprising a
nucleic acid molecule according to any of claims 1 to
7 or a polypeptide according to any of claims 8 to 10
together with a pharmaceutically acceptable carrier,
diluent or excipient therefor.

19. A Candida albicans cell comprising an
induced mutation in the DNA sequence encoding the
polypeptide according to any of claims 8 to 10.

20. A method of identifying compounds which
selectively modulate expression of polypeptides which
are crucial for growth and survival of Candida
albicans, which method comprises:

(a) contacting a compound to be tested with one
or more Candida albicans cells having a
mutation in a nucleic acid molecule
according to any of claims 1 to 6 which
mutation results in overexpression or
underexpression of said polypeptides in
addition to contacting one or more wild type
Candida albicans cells with said compound,

(b) monitoring the growth and/or activity of
said mutated cell compared to said wild
type; wherein differential growth or
activity of said one or more mutated Candida
cells is indicative of selective action of
said compound on a polypeptide or another
polypeptide in the same or a parallel
pathway.

21. A compound identifiable according to the
method of claim 20.

22. A compound according to claim 21 for use as 
a medicament. 

23. Use of a compound according to claim 21 in
the preparation of a medicament for treating Candida
albicans associated diseases.

24. A pharmaceutical composition comprising a
compound according to claim 21 together with a
pharmaceutically acceptable carrier, diluent or
excipient therefor.

25. A method of identifying DNA sequences from a
cell or organism which DNA encodes polypeptides which
are critical for growth or survival of said cell or
organism, which method comprises:

(a) preparing a cDNA or genomic library from
said cell or organism in a suitable
expression vector which vector is such that
it can either integrate into the genome in
said cell or that it permits transcription
of antisense RNA from the nucleotide
sequences in said cDNA or genomic library,

(b) selecting transformants exhibiting impaired
growth and determining the nucleotide
sequence of the cDNA or genomic sequence
from the library included in the vector from
said transformant.

26. A method according to claim 25 wherein said
cell or organism is a yeast or filamentous fungus.

27. A method according to claim 25 or 26 wherein
said cell or organism is any of Saccharomyces
cerevisiae, Schizosaccharomyces pombe or Candida albicans.

28. Plasmid pGAL1PSiST-1 having the sequence of
nucleotides illustrated in Figure 2.

29. Plasmid pGAL1PNiST-1 having the sequence of
nucleotides illustrated in Figure 4.

30. An antibody capable of binding to a
polypeptide according to any of claims 8 or 10.

31. An oligonucleotide comprising a fragment of
from 10 to 50 contiguous nucleic acid sequences of a
nucleic acid molecule according to any of claims 1 to
7.

32. A nucleic acid molecule encoding a
polypeptide which is critical for survival and growth
of the yeast Candida albicans, said nucleic acid
molecule comprising the sequences of any of the
nucleotide sequences illustrated in Figures 5 to 28.

33. A polypeptide which is critical for survival
and growth of the yeast Candida albicans, said
polypeptide comprising the amino acid sequences of any
of the sequences illustrated in Figures 29 to 39.



CYCT
CYCT primer
BamHI(6899)

HindIII



1 AGCTTGAGTA TTCTATAGTG TCACCTAAAT AGCTTGGCGT AATCATGGTC 
TCGAACTCAT AAGATATCAC AGTGGATTTA TCGAACCGCA TTAGTACCAG 



51 ATAGCTGTTT CCTGTGTGAA ATTGTTATCC GCTCACAATT CCACACAACA 
TATCGACAAA GGACACACTT TAACAATAGG CGAGTGTTAA GGTGTGTTCT 



101 TACGAGCCGG AAGCATAAAG TGTAAAGCCT GGGGTGCCTA ATGAGTGAGC 
ATGCTCGGCC TTCGTATTTC ACATTTCGGA CCCCACGGAT TACTCACTCG



151 TAACTCACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 
ATTGAGTGTA ATTAACGCAA CGCGAGTGAC GGGCGAAAGG TCAGCCCTTT 



201 CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG 
GGACAGCACG GTCGACGTAA TTACTTAGCC GGTTGCGCGC CCCTCTCCGC 



251 GTTTGCGTAT TGGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC 
CAAACGCATA ACCCGCGAGA AGGCGAAGGA GCGAGTGACT GAGCGACGCG



301 TCGGTCGTTC GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT 
AGCCAGCAAG CCGACGCCGC TCGCCATAGT CGAGTGAGTT TCCGCCATTA 



351 ACGGTTATCC ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA 
TGCCAATAGG TGTCTTAGTC CCCTATTGCG TCCTTTCTTG TACACTCGTT 



401 AAGGCCAGCA AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT 
TTCCGGTCGT TTTCCGGTCC TTGGCATTTT TCCGGCGCAA CGACCGCAAA



451 TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG 
AAGGTATCCG AGGCGGGGGG ACTGCTCGTA GTGTTTTTAG CTGCGAGTTC 



501 TCAGAGGTGG CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC 
AGTCTCCACC GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGGG 



551 CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA 
GACCTTCGAG GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATGGCCT 



601 TACCTGTCCG CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC 
ATGGACAGGC GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTATCGAG 



651 ACGCTGTAGG TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT 
TGCGACATCC ATAGAGTCAA GCCACATCCA GCAAGCGAGG TTCGACCCGA 



ApaLI 



701 GTGTGCACGA ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC 
CACACGTGCT TGGGGGGCAA GTCGGGCTGG CGACGCGGAA TAGGCCATTG 



751 TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC 
ATAGCAGAAC TCAGGTTGGG CCATTCTGTG CTGAATAGCG GTGACCGTCG 



801 AGCCACTGGT AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG 
TCGGTGACCA TTGTCCTAAT CGTCTCGCTC CATACATCCG CCACGATGTC 



851 AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT 
TCAAGAACTT CACCACCGGA TTGATGCCGA TGTGATCTTC CTGTCATAAA 



901 GGTATCTGCG CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG 
CCATAGACGC GAGACGACTT CGGTCAATGG AAGCCTTTTT CTCAACCATC 



951 CTCTTGATCC GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT
GAGAACTAGG CCGTTTGTTT GGTGGCGACC ATCGCCACCA AAAAAACAAA 



1001 GCAAGCAGCA GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG 
CGTTCGTCGT CTAATGCGCG TCTTTTTTTC CTAGAGTTCT TCTAGGAAAC 



1051 ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG 
TAGAAAAGAT GCCCCAGACT GCGAGTCACC TTGCTTTTGA GTGCAATTCC 



1101 GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 
CTAAAACCAG TACTCTAATA GTTTTTCCTA GAAGTGGATC TAGGAAAATT



1151 ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG 
TAATTTTTAC TTCAAAATTT AGTTAGATTT CATATATACT CATTTGAACC 



1201 TCTGACAGTT ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG 
AGACTGTCAA TGGTTACGAA TTAGTCACTC CGTGGATAGA GTCGCTAGAC 



1251 TCTATTTCGT TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA 
AGATAAAGCA AGTAGGTATC AACGGACTGA GGGGCAGCAC ATCTATTGAT 



1301 CGATACGGGA GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA 
GCTATGCCCT CCCGAATGGT AGACCGGGGT CACGACGTTA CTATGGCGCT 



1351 GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGG 
CTGGGTGCGA GTGGCCGAGG TCTAAATAGT CGTTATTTGG TCGGTCGGCC 



1401 AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT 
TTCCCGGCTC GCGTCTTCAC CAGGACGTTG AAATAGGCGG AGGTAGGTCA 



1451 CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT 
GATAATTAAC AACGGCCCTT CGATCTCATT CATCAAGCGG TCAATTATCA 



1501 TTGCGCAACG TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC 
AACGCGTTGC AACAACGGTA ACGATGTCCG TAGCACCACA GTGCGAGCAG 



1551 GTTTGGTATG GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA 
CAAACCATAC CGAAGTAAGT CGAGGCCAAG GGTTGCTAGT TCCGCTCAAT 



1601 CATGATCCCC CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCG 
GTACTAGGGG GTACAACACG TTTTTTCGCC AATCGAGGAA GCCAGGAGGC 



1651 ATCGTTGTCA GAAGTAAGTT GGCCGCAGTG TTATCACTCA TGGTTATGGC 
TAGCAACAGT CTTCATTCAA CCGGCGTCAC AATAGTGAGT ACCAATACCG 



1701 AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG
TCGTGACGTA TTAAGAGAAT GACAGTACGG TAGGCATTCT ACGAAAAGAC



1751 TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA 
ACTGACCACT CATGAGTTGG TTCAGTAAGA CTCTTATCAC ATACGCCGCT 



1801 CCGAGTTGCT CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG 
GGCTCAACGA GAACGGGCCG CAGTTATGCC CTATTATGGC GCGGTGTATC 



1851 CAGAACTTTA AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC 
GTCTTGAAAT TTTCACGAGT AGTAACCTTT TGCAAGAAGC CCCGCTTTTG 






1901 TCTCAAGGAT CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT 
AGAGTTCCTA GAATGGCGAC AACTCTAGGT CAAGCTACAT TGGGTGAGCA 



ApaLI 

1951 GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG
CGTGGGTTGA CTAGAAGTCG TAGAAAATGA AAGTGGTCGC AAAGACCCAC



2001 AGCAAAAACA GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC 
TCGTTTTTGT CCTTCCGTTT TACGGCGTTT TTTCCCTTAT TCCCGCTGTG



2051 GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT 
CCTTTACAAC TTATGAGTAT GAGAAGGAAA AAGTTATAAT AACTTCGTAA 



2101 TATCAGGGTT ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA 
ATAGTCCCAA TAACAGAGTA CTCGCCTATG TATAAACTTA CATAAATCTT 



2151 AAATAAACAA ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG 
TTTATTTGTT TATCCCCAAG GCGCGTGTAA AGGGGCTTTT CACGGTGGAC 



2201 ACGTCTAAGA AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT 
TGCAGATTCT TTGGTAATAA TAGTACTGTA ATTGGATATT TTTATCCGCA 



2251 ATCACGAGGC CCTTTCGTCT CGCGCGTTTC GGTGATGACG GTGAAAACCT 
TAGTGCTCCG GGAAAGCAGA GCGCGCAAAG CCACTACTGC CACTTTTGGA 



2301 CTGACACATG CAGCTCCCGG AGACGGTCAC AGCTTGTCTG TAAGCGGATG 
GACTGTGTAC GTCGAGGGCC TCTGCCAGTG TCGAACAGAC ATTCGCCTAC 



2351 CCGGGAGCAG ACAAGCCCGT CAGGGCGCGT CAGCGGGTGT TGGCGGGTGT 
GGCCCTCGTC TGTTCGGGCA GTCCCGCGCA GTCGCCCACA ACCGCCCACA 



ApaLI 



2401 CGGGGCTGGC TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA 
GCCCCGACCG AATTGATACG CCGTAGTCTC GTCTAACATG ACTCTCACGT 



ApaLI 

2451 CCATATGCGG TGTGAAATAC CGCACAGATG CGTAAGGAGA AAATACCGCA 
GGTATACGCC ACACTTTATG GCGTGTCTAC GCATTCCTCT TTTATGGCGT 



2501 TCAGGCGAAA TTGTAAACGT TAATATTTTG TTAAAATTCG CGTTAAATAT
AGTCCGCTTT AACATTTGCA ATTATAAAAC AATTTTAAGC GCAATTTATA 



2551 TTGTTAAATC AGCTCATTTT TTAACCAATA GGCCGAAATC GGCAAAATCC 
AACAATTTAG TCGAGTAAAA AATTGGTTAT CCGGCTTTAG CCGTTTTAGG



2601 CTTATAAATC AAAAGAATAG ACCGAGATAG GGTTGAGTGT TGTTCCAGTT 
GAATATTTAG TTTTCTTATC TGGCTCTATC CCAACTCACA ACAAGGTCAA 



2651 TGGAACAAGA GTCCACTATT AAAGAACGTG GACTCCAACG TCAAAGGGCG 
ACCTTGTTCT CAGGTGATAA TTTCTTGCAC CTGAGGTTGC AGTTTCCCGC



2701 AAAAACCGTC TATCAGGGCG ATGGCCCACT ACGTGAACCA TCACCCAAAT 
TTTTTGGCAG ATAGTCCCGC TACCGGGTGA TGCACTTGGT AGTGGGTTTA 



2751 CAAGTTTTTT GCGGTCGAGG TGCCGTAAAG CTCTAAATCG GAACCCTAAA 
GTTCAAAAAA CGCCAGCTCC ACGGCATTTC GAGATTTAGC CTTGGGATTT 



2801 GGGAGCCCCC GATTTAGAGC TTGACGGGGA AAGCCGGCGA ACGTGGCGAG 
CCCTCGGGGG CTAAATCTCG AACTGCCCCT TTCGGCCGCT TGCACCGCTC 



2851 AAAGGAAGGG AAGAAAGCGA AAGGAGCGGG CGCTAGGGCG CTGGCAAGTG 
TTTCCTTCCC TTCTTTCGCT TTCCTCGCCC GCGATCCCGC GACCGTTCAC 



2901 TAGCGGTCAC GCTGCGCGTA ACCACCACAC CCGCCGCGCT TAATGCGCCG 
ATCGCCAGTG CGACGCGCAT TGGTGGTGTG GGCGGCGCGA ATTACGCGGC



2951 CTACAGGGCG CGTCCATTCG CCATTCAGGC TGCGCAACTG TTGGGAAGGG 
GATGTCCCGC GCAGGTAAGC GGTAAGTCCG ACGCGTTGAC AACCCTTCCC 



3001 CGATCGGTGC GGGCCTCTTC GCTATTACGC CAGCTGGCGA AAGGGGGATG 
GCTAGCCACG CCCGGAGAAG CGATAATGCG GTCGACCGCT TTCCCCCTAC 



3051 TGCTGCAAGG CGATTAAGTT GGGTAACGCC AGGGTTTTCC CAGTCACGAC
ACGACGTTCC GCTAATTCAA CCCATTGCGG TCCCAAAAGG GTCAGTGCTG 



3101 GTTGTAAAAC GACGGCCAGT GAATTGTAAT ACGACTCACT ATAGGGCGAA 
CAACATTTTG CTGCCGGTCA CTTAACATTA TGCTGAGTGA TATCCCGCTT 



3151 TTGGTTTTCC AATGATGAGC ACTTTTAAAG TTCTGCTATG TGGCGCGGTA 
AACCAAAAGG TTACTACTCG TGAAAATTTC AAGACGATAC ACCGCGCCAT 



3201 TTATCCCGTG TTGACGCCGG GCAAGAGCAA CTCGGTCGCC GCATACACTA 
AATAGGGCAC AACTGCGGCC CGTTCTCGTT GAGCCAGCGG CGTATGTGAT



3251 TTCTCAGAAT GACTTGGTTG AGTACTAATA GGAATTGATT TGGATGGTAT 
AAGAGTCTTA CTGAACCAAC TCATGATTAT CCTTAACTAA ACCTACCATA 



3301 AAACGGAAAC AAAAAAAAGA GCTGGTACTA CTTTCTTTAA AATTATTTTA 
TTTGCCTTTG TTTTTTTTCT CGACCATGAT GAAAGAAATT TTAATAAAAT 



3351 TTATTTGATT TTATTTAATA GTATATATTA TATTTTGAAC GTAGATTATT 
AATAAACTAA AATAAATTAT CATATATAAT ATAAAACTTG CATCTAATAA 



3401 TTGTTGAAAG TTGCTGTAGT GCCATTGATT CGTAACACTA ATTCTGTATT
AACAACTTTC AACGACATCA CGGTAACTAA GCATTGTGAT TAAGACATAA 



3451 AGTCATTCCT CTTGTTTGAT AGTATCCAAA AAAACGGCTA TTTTTTTGCA 
TCAGTAAGGA GAACAAACTA TCATAGGTTT TTTTGCCGAT AAAAAAACGT 



3501 ATCTTATTTC CTGCATATTA TACAGATAAC ATAATGAAAG AAAAAATCTT 
TAGAATAAAG GACGTATAAT ATGTCTATTG TATTACTTTC TTTTTTAGAA 



3551 TTTTTTTGTT CTTCAATGAT GATTTCAACC ATTCTTTTAA ACATTGATCA 
AAAAAAACAA GAAGTTACTA CTAAAGTTGG TAAGAAAATT TGTAACTAGT 



3601 ATTCCTGAGC AACAACCCCA TACACACTGG TTTATATACC GCCCCTTTTA 
TAAGGACTCG TTGTTGGGGT ATGTGTGACC AAATATATGG CGGGGAAAAT 



3651 CAGTTGAAGA AAGAAATAGA AATAGAAATA GCAAACAAAA GATATGACAG 
GTCAACTTCT TTCTTTATCT TTATCTTTAT CGTTTGTTTT CTATACTGTC 



3701 TCAACACTAA GACCTATAGT GAGAGAGCAG AAACTCATGC CTCACCAGTA 
AGTTGTGATT CTGGATATCA CTCTCTCGTC TTTGAGTACG GAGTGGTCAT 



3751 GCACAGCGAT TATTTCGATT AATGGAACTG AAGAAAACCA ATTTATGTGC 
CGTGTCGCTA ATAAAGCTAA TTACCTTGAC TTCTTTTGGT TAAATACACG 



EcoRI 



3801 ATCAATTGAC GTTGATACCA CTAAGGAATT CCTTGAATTA ATTGATAAAT 
TAGTTAACTG CAACTATGGT GATTCCTTAA GGAACTTAAT TAACTATTTA 



3851 TAGGTCCTTA TGTATGCTTA ATCAAGACTC ATATTGATAT AATCAATGAT 
ATCCAGGAAT ACATACGAAT TAGTTCTGAG TATAACTATA TTAGTTACTA 



3901 TTTTCCTATG AATCCACTAT TGAACCATTA TTAGAACTTT CACGTAAACA 
AAAAGGATAC TTAGGTGATA ACTTGGTAAT AATCTTGAAA GTGCATTTGT 



3951 TCAATTTATG ATTTTTGAAG ATAGAAAATT TGCTGATATT GGTAATACCG 
AGTTAAATAC TAAAAACTTC TATCTTTTAA ACGACTATAA CCATTATGGC 



4001 TAAAGAAACA ATATATTGGT GGAGTTTATA AAATTAGTAG TTGGGCAGAT 
ATTTCTTTGT TATATAACCA CCTCAAATAT TTTAATCATC AACCCGTCTA 



4051 ATTACCAATG CTCATGGTGT CACTGGGAAT GGAGTGGTTG AAGGATTAAA 
TAATGGTTAC GAGTACCACA GTGACCCTTA CCTCACCAAC TTCCTAATTT 



4101 ACAGGGAGCT AAAGAAACCA CCACCAACCA AGAGCCAAGA GGGTTATTGA 
TGTCCCTCGA TTTCTTTGGT GGTGGTTGGT TCTCGGTTCT CCCAATAACT



4151 TGTTAGCTGA ATTATCATCA GTGGGATCAT TAGCATATGG AGAATATTCT 
ACAATCGACT TAATAGTAGT CACCCTAGTA ATCGTATACC TCTTATAAGA 



4201 CAAAAAACTG TTGAAATTGC TAAATCCGAT AAGGAATTTG TTATTGGATT 
GTTTTTTGAC AACTTTAACG ATTTAGGCTA TTCCTTAAAC AATAACCTAA 



4251 TATTGCCCAA CGTGATATGG GTGGCCAAGA AGAAGGATTT GATTGGCTTA 
ATAACGGGTT GCACTATACC CACCGGTTCT TCTTCCTAAA CTAACCGAAT 



4301 TTATGACACC TGGAGTTGGA TTAGATGATA AAGGTGATGG ATTAGGACAA 
AATACTGTGG ACCTCAACCT AATCTACTAT TTCCACTACC TAATCCTGTT 



4351 CAATATAGAA CTGTTGATGA AGTTGTTAGC ACTGGAACTG ATATTATCAT 
GTTATATCTT GACAACTACT TCAACAATCG TGACCTTGAC TATAATAGTA 



4401 TGTTGGTAGA GGATTGTTTG GTAAAGGAAG AGATCCAGAT ATTGAAGGTA 
ACAACCATCT CCTAACAAAC CATTTCCTTC TCTAGGTCTA TAACTTCCAT 



4451 AAAGGTATAG AAATGCTGGT TGGAATGCTT ATTTGAAAAA GACTGGCCAA 
TTTCCATATC TTTACGACCA ACCTTACGAA TAAACTTTTT CTGACCGGTT 



4501 TTATAAATGT GAAGGGGGAG ATTTTCACTT TATTAGATTT GTATATATGT 
AATATTTACA CTTCCCCCTC TAAAAGTGAA ATAATCTAAA CATATATACA 



4551 AGAATAAATA AATAAATAAG TTAAATAAAT AATTAAATAA GGGTGGTAAT 
TCTTATTTAT TTATTTATTC AATTTATTTA TTAATTTATT CCCACCATTA 



4601 TATTACTATT TACAATCAAA GGTGGTCCTT CTAGCTGTAA TCCGGGCAGC 
ATAATGATAA ATGTTAGTTT CCACCAGGAA GATCGACATT AGGCCCGTCG 



4651 GCAACGGAAC ATTCATCAGT GTAAAAATGG AATCAATAAA GCCCTGCGCA
CGTTGCCTTG TAAGTAGTCA CATTTTTACC TTAGTTATTT CGGGACGCGT 



4701 GCGCGCAGGG TCAGCCTGAA TACGCGTTTA ATGACCAGCA CAGTCGTGAT 
CGCGCGTCCC AGTCGGACTT ATGCGCAAAT TACTGGTCGT GTCAGCACTA 



4751 GGCAAGGTCA GAATAGCCCA AGTCGGCCGA GGGGCCTGTA CAGTGAGGGA 
CCGTTCCAGT CTTATCGGGT TCAGCCGGCT CCCCGGACAT GTCACTCCCT



4801 AGATCTGATA TTGACGAAGA GGAACCAATG TAACGTTACA CTGAAGAAAA 
TCTAGACTAT AACTGCTTCT CCTTGGTTAC ATTGCAATGT GACTTCTTTT 



4851 CACACAATAA ACGGGAAGAA ACGGTGTAAA AGTGTGAAAA TAATTTTTGA 
GTGTGTTATT TGCCCTTCTT TGCCACATTT TCACACTTTT ATTAAAAACT 



4901 ATATCATTTC CCTTGGTTTA ATTCCAAACG AAACGTGTTT TTTTTAGAGA 
TATAGTAAAG GGAACCAAAT TAAGGTTTGC TTTGCACAAA AAAAATCTCT 



EcoRI ApaLI



4951 ATGGGAATTC TTATTGGATG TCTAGATTGT TTGTTTACTC CAGACTGTGC 
TACCCTTAAG AATAACCTAC AGATCTAACA AACAAATGAG GTCTGACACG 



5001 ACAAAAACGT TTGGATGGAT GATCAGAAGA TATTTTTAGG CTTAGCTCTA 
TGTTTTTGCA AACCTACCTA CTAGTCTTCT ATAAAAATCC GAATCGAGAT 



5051 AATATAAGAA ATGATGCTTG AAAAACCAGA CAGAAATTGA GTTTCAAAAA 
TTATATTCTT TACTACGAAC TTTTTGGTCT GTCTTTAACT CAAAGTTTTT 



5101 TTGGTAATGT GAGGTATTAG TCAACTAACC AAATAACAAT GCAAACCGGT 
AACCATTACA CTCCATAATC AGTTGATTGG TTTATTGTTA CGTTTGGCCA 



5151 TGATACATTT CATTTTGAAA ATAATGAAAC TGGAATTGGA TGACCAGCAC 
ACTATGTAAA GTAAAACTTT TATTACTTTG ACCTTAACCT ACTGGTCGTG 



5201 ACAAACACAT AAAGTAATTA TGGGAATTAG AAGCGAACAT AGAGGAGTAC 
TGTTTGTGTA TTTCATTAAT ACCCTTAATC TTCGCTTGTA TCTCCTCATG 



5251 TTGGCCACGA ACAGAATACA AGTGGGAACA CTATTTTCTC CATTGTTTTA 
AACCGGTGCT TGTCTTATGT TCACCCTTGT GATAAAAGAG GTAACAAAAT 



5301 GTTCTGTTTT TTTGTCAGCC TAGTTTTGTG CTATGTGTAA AAAATATTGC 
CAAGACAAAA AAACAGTCGG ATCAAAACAC GATACACATT TTTTATAACG 



HindIII



5351 CAAGAAAAAA AGCTTGTTTT GTGGCCAGTG TCCGAAAAAA ATTTTGGGGA 
GTTCTTTTTT TCGAACAAAA CACCGGTCAC AGGCTTTTTT TAAAACCCCT 



5401 ATCTTCGGAT TAATTTATGT TTTCATTCCA TCGGGGAAAG TGGGGGGGAA 
TAGAAGCCTA ATTAAATACA AAAGTAAGGT AGCCCCTTTC ACCCCCCCTT 



5451 AAAATTTTAA GCAGTTCACA AAACCTTCCA AAAAATATAT GGACAAAGAT 
TTTTAAAATT CGTCAAGTGT TTTGGAAGGT TTTTTATATA CCTGTTTCTA 



5501 GATTGTATTT TCCCGACACC AAAATCATAA TTAATTATGA GAAAGTTAAA 
CTAACATAAA AGGGCTGTGG TTTTAGTATT AATTAATACT CTTTCAATTT 



5551 TGTAACGTTA CAATTTATGT TTATTTGAAG GTGAAAAGCG ATTTATGATT 
ACATTGCAAT GTTAAATACA AATAAACTTC CACTTTTCGC TAAATACTAA 



5601 TTTCCGAAAT GAAAATTTTT TTTAGGTTTA TTTTTTTTGT CGGGCAAAGA 
AAAGGCTTTA CTTTTAAAAA AAATCCAAAT AAAAAAAACA GCCCGTTTCT 



EcoRI 



5651 AAAACTGAAC AAGGATTATT AAAATTTTTG GTGTTTGTTT GTGTCTGGAG 
TTTTGACTTG TTCCTAATAA TTTTAAAAAC CACAAACAAA CACAGACCTC 



EcoRI 



5701 AATTCATTCC TCTCTCATCT TCACACAATG TTTAGACATC TGACACGATT 
TTAAGTAAGG AGAGAGTAGA AGTGTGTTAC AAATCTGTAG ACTGTGCTAA



5751 CATGATAGTT CGGTTTCCGG GGTTGGTGTT TAGTTTTCGT TTTTCTTTTT 
GTACTATCAA GCCAAAGGCC CCAACCACAA ATCAAAAGCA AAAAGAAAAA 



5801 TTTTGGAAAG AATGTTTTAG CTCATTGGTT TTCTTTCTTC ATTCAATAGT 
AAAACCTTTC TTACAAAATC GAGTAACCAA AAGAAAGAAG TAAGTTATCA 



5851 TTTGAAAGAA TTTGCCCACT TGTTATTACA ATCATATAAA ATTAAACTTT 
AAACTTTCTT AAACGGGTGA ACAATAATGT TAGTATATTT TAATTTGAAA 



5901 GATATAAAAT AGAGTTTGAA AGTTTCCCAG ATCCTTTTTG ATTTCTTTGT 
CTATATTTTA TCTCAAACTT TCAAAGGGTC TAGGAAAAAC TAAAGAAACA 



5951 AAATTTTTTT TTCTCCCACA TATACACACA TACAAACCGA TTTTTATAAG 
TTTAAAAAAA AAGAGGGTGT ATATGTGTGT ATGTTTGGCT AAAAATATTC 



PstI AvaI BamHI



6001 AAAGAGTTAT ACCCTGCAGC TCGACCTCGA GGGATCCGGG CCCTCTAGAT 
TTTCTCAATA TGGGACGTCG AGCTGGAGCT CCCTAGGCCC GGGAGATCTA



AvaI



6051 GCGGCCGCTA GGCCTCGAGG GACTTTTGCA CCAAAAATAA TTTATTTTCC 
CGCCGGCGAT CCGGAGCTCC CTGAAAACGT GGTTTTTATT AAATAAAAGG 



6101 AAAATAAAAT TTAAATAAAT AAAAATAACT CATAATTTAA TAAAAATTTC 
TTTTATTTTA AATTTATTTA TTTTTATTGA GTATTAAATT ATTTTTAAAG 



6151 AAAATCTTCT AGTGTCCTTT CATATGCAGT ACATTAGCCA TCAGTCACTT 
TTTTAGAAGA TCACAGGAAA GTATACGTCA TGTAATCGGT AGTCAGTGAA 



6201 AAACAGCATC TGCTGGTTGA AGAATGCTTG AAGCAATTGT CCAGTCCCAG 
TTTGTCGTAG ACGACCAACT TCTTACGAAC TTCGTTAACA GGTCAGGGTC 



6251 AGGCACAGGC TAGGAGATCT TCAGTTTCGG AGGTAACCTG TAAGTCTGTT 
TCCGTGTCCG ATCCTCTAGA AGTCAAAGCC TCCATTGGAC ATTCAGACAA 



6301 AATGAAGTAA AAGTTCCTTA GGATTTCCAC TCTGACTATG GTCCAGGCAC
TTACTTCATT TTCAAGGAAT CCTAAAGGTG AGACTGATAC CAGGTCCGTG



6351 AGTGACTGTA CTCCTTGGCC TTCAGGTAAT GCAGAATCCT CCCATAATAT 
TCACTGACAT GAGGAACCGG AAGTCCATTA CGTCTTAGGA GGGTATTATA 



6401 CTTTTCAGGT GCAGACTGCT CATGAGTTTT CCCCTGGTGA AATCTTCTTT 
GAAAAGTCCA CGTCTGACGA GTACTCAAAA GGGGACCACT TTAGAAGAAA 



6451 CTCCAGTTTT TCTTCCAGGA CTGTCTTCAG ATGGTTTATC TGATGATAGA 
GAGGTCAAAA AGAAGGTCCT GACAGAAGTC TACCAAATAG ACTACTATCT 



6501 CATTAGCCAG GAGGTTCTCA ACAATAGTCT CATTCCAGCC AGTGCTAGAT 
GTAATCGGTC CTCCAAGAGT TGTTATCAGA GTAAGGTCGG TCACGATCTA



6551 GAATCTTGTC TGAAAATAGC AAAGATGTTC TGGAGCATCT CATAGATGGT 
CTTAGAACAG ACTTTTATCG TTTCTACAAG ACCTCGTAGA GTATCTACCA 



PstI 



6601 CAATGCGGCG TCCTCCTTCT GGAACTGCTG CAGCTGCTTA ATCTCCTCAG
GTTACGCCGC AGGAGGAAGA CCTTGACGAC GTCGACGAAT TAGAGGAGTC



6651 GGATGTCAAA GTTCATCCTG TCCTTGAGGC AGTATTCAAG CCTCCCATTC
CCTACAGTTT CAAGTAGGAC AGGAACTCCG TCATAAGTTC GGAGGGTAAG



6701 AATTGCCACA GGAGCTTCTG ACACTGAAAA TTGCTGCTTC TTTGTAGGAA
TTAACGGTGT CCTCGAAGAC TGTGACTTTT AACGACGAAG AAACATCCTT



6751 TCCAAGCAAG TTGTAGCTCA TGGAAAGAGC TGTAGTGGAG AAGCACAACA
AGGTTCGTTC AACATCGAGT ACCTTTCTCG ACATCACCTC TTCGTGTTGT



AvaI



6801 GGAGAGCAAT TTGGAGGAGA CACTTGTTGG TCATGTTCCT CGAGGCCTTT
CCTCTCGTTA AACCTCCTCT GTGAACAACC AGTACAAGGA GCTCCGGAAA



BamHI



6851 TTGGCCAGCT GGCGCCTGCT GCGCGACGGC GAGCTGCTCA CCACCCAGGA
AACCGGTCGA CCGCGGACGA CGCGCTGCCG CTCGACGAGT GGTGGGTCCT



BamHI



6901 TCCGTCCCCC TTTTCCTTTG TCGATATCAT GTAATTAGTT ATGTCACGCT
AGGCAGGGGG AAAAGGAAAC AGCTATAGTA CATTAATCAA TACAGTGCGA



6951 TACATTCACG CCCTCCCCCC ACATCCGCTC TAACCGAAAA GGAAGGAGTT
ATGTAAGTGC GGGAGGGGGG TGTAGGCGAG ATTGGCTTTT CCTTCCTCAA



7001 AGACAACCTG AAGTCTAGGT CCCTATTTAT TTTTTTATAG TTATGTTAGT
TCTGTTGGAC TTCAGATCCA GGGATAAATA AAAAAATATC AATACAATCA



7051 ATTAAGAACG TTATTTATAT TTCAAATTTT TCTTTTTTTT CTGTACAGAC
TAATTCTTGC AATAAATATA AAGTTTAAAA AGAAAAAAAA GACATGTCTG



7101 GCGTGTACGC ATGTAACATT ATACTGAAAA CCTTGCTTGA GAAGGTTTTG
CGCACATGCG TACATTGTAA TATGACTTTT GGAACGAACT CTTCCAAAAC



HindIII



7151 GGACGCTCGA AGGCTTTAAT TTGCA
CCTGCGAGCT TCCGAAATTA AACGT







[Plasmid map. Labelled features: CaGAL1 P, CaURA3, hIFNβ, 3'UTR, CYCT, ORI. Restriction sites (position): EcoRI (276), PstI (594), PstI (621), HindIII (629), AvaI (647), PstI (862), AvaI (1422), BamHI (1446), XmaI (1451), AvaI (1451), SmaI (1453), EcoRI (1502), XmaI (1518), AvaI (1518), SmaI (1520), ClaI (1527), ClaI (2118), ApaLI (2494), ApaLI (3740), ApaLI (4237), AvaI (4964), EcoRI (6171), ApaLI (6213), HindIII (6576).]
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1 TTCCATCGGG GAAAGTGGGG GGGAAAAAAT TTTAAGCAGT TCACAAAACC 
AAGGTAGCCC CTTTCACCCC CCCTTTTTTA AAATTCGTCA AGTGTTTTGG 



51 TTCCAAAAAA TATATGGACA AAGATGATTG TATTTTCCCG ACACCAAAAT
AAGGTTTTTT ATATACCTGT TTCTACTAAC ATAAAAGGGC TGTGGTTTTA



101 CATAATTAAT TATGAGAAAG TTAAATGTAA CGTTACAATT TATGTTTATT
GTATTAATTA ATACTCTTTC AATTTACATT GCAATGTTAA ATACAAATAA



151 TGAAGGTGAA AAGCGATTTA TGATTTTTCC GAAATGAAAA TTTTTTTTAG
ACTTCCACTT TTCGCTAAAT ACTAAAAAGG CTTTACTTTT AAAAAAAATC



201 GTTTATTTTT TTTGTCGGGC AAAGAAAAAC TGAACAAGGA TTATTAAAAT
CAAATAAAAA AAACAGCCCG TTTCTTTTTG ACTTGTTCCT AATAATTTTA



EcoRI



251 TTTTGGTGTT TGTTTGTGTC TGGAGAATTC ATTCCTCTCT CATCTTCACA
AAAACCACAA ACAAACACAG ACCTCTTAAG TAAGGAGAGA GTAGAAGTGT



301 CAATGTTTAG ACATCTGACA CGATTCATGA TAGTTCGGTT TCCGGGGTTG
GTTACAAATC TGTAGACTGT GCTAAGTACT ATCAAGCCAA AGGCCCCAAC



351 GTGTTTAGTT TTCGTTTTTC TTTTTTTTTG GAAAGAATGT TTTAGCTCAT
CACAAATCAA AAGCAAAAAG AAAAAAAAAC CTTTCTTACA AAATCGAGTA



401 TGGTTTTCTT TCTTCATTCA ATAGTTTTGA AAGAATTTGC CCACTTGTTA
ACCAAAAGAA AGAAGTAAGT TATCAAAACT TTCTTAAACG GGTGAACAAT



451 TTACAATCAT ATAAAATTAA ACTTTGATAT AAAATAGAGT TTGAAAGTTT
AATGTTAGTA TATTTTAATT TGAAACTATA TTTTATCTCA AACTTTCAAA



501 CCCAGATCCT TTTTGATTTC TTTGTAAATT TTTTTTTCTC CCACATATAC 
GGGTCTAGGA AAAACTAAAG AAACATTTAA AAAAAAAGAG GGTGTATATG 



PstI 



551 ACACATACAA ACCGATTTTT ATAAGAAAGA GTTATACCCT GCAGCTCGAC 
TGTGTATGTT TGGCTAAAAA TATTCTTTCT CAATATGGGA CGTCGAGCTG 



PstI HindIII AvaI



601 CTCGACTGTT TAAACCTGCA GGCATGCAAG CTTGGCCAAA AAGGCCTCGA
GAGCTGACAA ATTTGGACGT CCGTACGTTC GAACCGGTTT TTCCGGAGCT 



AvaI

651 GGAACATGAC CAACAAGTGT CTCCTCCAAA TTGCTCTCCT GTTGTGCTTC 
CCTTGTACTG GTTGTTCACA GAGGAGGTTT AACGAGAGGA CAACACGAAG 



701 TCCACTACAG CTCTTTCCAT GAGCTACAAC TTGCTTGGAT TCCTACAAAG 
AGGTGATGTC GAGAAAGGTA CTCGATGTTG AACGAACCTA AGGATGTTTC 



751 AAGCAGCAAT TTTCAGTGTC AGAAGCTCCT GTGGCAATTG AATGGGAGGC
TTCGTCGTTA AAAGTCACAG TCTTCGAGGA CACCGTTAAC TTACCCTCCG



801 TTGAATACTG CCTCAAGGAC AGGATGAACT TTGACATCCC TGAGGAGATT
AACTTATGAC GGAGTTCCTG TCCTACTTGA AACTGTAGGG ACTCCTCTAA






PstI 



851 AAGCAGCTGC AGCAGTTCCA GAAGGAGGAC GCCGCATTGA CCATCTATGA 
TTCGTCGACG TCGTCAAGGT CTTCCTCCTG CGGCGTAACT GGTAGATACT 



901 GATGCTCCAG AACATCTTTG CTATTTTCAG ACAAGATTCA TCTAGCACTG 
CTACGAGGTC TTGTAGAAAC GATAAAAGTC TGTTCTAAGT AGATCGTGAC 



951 GCTGGAATGA GACTATTGTT GAGAACCTCC TGGCTAATGT CTATCATCAG 
CGACCTTACT CTGATAACAA CTCTTGGAGG ACCGATTACA GATAGTAGTC 



1001 ATAAACCATC TGAAGACAGT CCTGGAAGAA AAACTGGAGA AAGAAGATTT 
TATTTGGTAG ACTTCTGTCA GGACCTTCTT TTTGACCTCT TTCTTCTAAA 



1051 CACCAGGGGA AAACTCATGA GCAGTCTGCA CCTGAAAAGA TATTATGGGA
GTGGTCCCCT TTTGAGTACT CGTCAGACGT GGACTTTTCT ATAATACCCT 



1101 GGATTCTGCA TTACCTGAAG GCCAAGGAGT ACAGTCACTG TGCCTGGACC 
CCTAAGACGT AATGGACTTC CGGTTCCTCA TGTCAGTGAC ACGGACCTGG 



1151 ATAGTCAGAG TGGAAATCCT AAGGAACTTT TACTTCATTA ACAGACTTAC 
TATCAGTCTC ACCTTTAGGA TTCCTTGAAA ATGAAGTAAT TGTCTGAATG 



1201 AGGTTACCTC CGAAACTGAA GATCTCCTAG CCTGTGCCTC TGGGACTGGA 
TCCAATGGAG GCTTTGACTT CTAGAGGATC GGACACGGAG ACCCTGACCT 



1251 CAATTGCTTC AAGCATTCTT CAACCAGCAG ATGCTGTTTA AGTGACTGAT 
GTTAACGAAG TTCGTAAGAA GTTGGTCGTC TACGACAAAT TCACTGACTA 



1301 GGCTAATGTA CTGCATATGA AAGGACACTA GAAGATTTTG AAATTTTTAT 
CCGATTACAT GACGTATACT TTCCTGTGAT CTTCTAAAAC TTTAAAAATA 



1351 TAAATTATGA GTTATTTTTA TTTATTTAAA TTTTATTTTG GAAAATAAAT 
ATTTAATACT CAATAAAAAT AAATAAATTT AAAATAAAAC CTTTTATTTA 



XmaI
SmaI
BamHI



AvaI AvaI



1401 TATTTTTGGT GCAAAAGTCC CTCGAGGCCT AGCGGCCGCC TAGAGGATCC 
ATAAAAACCA CGTTTTCAGG GAGCTCCGGA TCGCCGGCGG ATCTCCTAGG 



XmaI



SmaI



AvaI



1451 CCGGGCGCTA GGCGGCCGCT AGGCCTTTTT GGCCAAGCTC GAATTTCGAG 
GGCCCGCGAT CCGCCGGCGA TCCGGAAAAA CCGGTTCGAG CTTAAAGCTC 



EcoRI AvaI ClaI



1501 GAATTCGAGC TCGGTACCCG GGGGATCGAT CCGTCCCCCT TTTCCTTTGT 
CTTAAGCTCG AGCCATGGGC CCCCTAGCTA GGCAGGGGGA AAAGGAAACA 



1551 CGATATCATG TAATTAGTTA TGTCACGCTT ACATTCACGC CCTCCCCCCA
GCTATAGTAC ATTAATCAAT ACAGTGCGAA TGTAAGTGCG GGAGGGGGGT



1601 CATCCGCTCT AACCGAAAAG GAAGGAGTTA GACAACCTGA AGTCTAGGTC 
GTAGGCGAGA TTGGCTTTTC CTTCCTCAAT CTGTTGGACT TCAGATCCAG 



1651 CCTATTTATT TTTTTATAGT TATGTTAGTA TTAAGAACGT TATTTATATT 
GGATAAATAA AAAAATATCA ATACAATCAT AATTCTTGCA ATAAATATAA 



1701 TCAAATTTTT CTTTTTTTTC TGTACAGACG CGTGTACGCA TGTAACATTA 
AGTTTAAAAA GAAAAAAAAG ACATGTCTGC GCACATGCGT ACATTGTAAT 



1751 TACTGAAAAC CTTGCTTGAG AAGGTTTTGG GACGCTCGAA GGCTTTAATT 
ATGACTTTTG GAACGAACTC TTCCAAAACC CTGCGAGCTT CCGAAATTAA 



1801 TGCAAGCTAG CTTGGCGTAA TCATGGTCAT AGCTGTTTCC TGTGTGAAAT
ACGTTCGATC GAACCGCATT AGTACCAGTA TCGACAAAGG ACACACTTTA 



1851 TGTTATCCGC TCACAATTCC ACACAACATA CGAGCCGGAA GCATAAAGTG 
ACAATAGGCG AGTGTTAAGG TGTGTTGTAT GCTCGGCCTT CGTATTTCAC 



1901 TAAAGCCTGG GGTGCCTAAT GAGTGAGCTA ACTCACATTA ATTGCGTTGC 
ATTTCGGACC CCACGGATTA CTCACTCGAT TGAGTGTAAT TAACGCAACG



1951 GCTCACTGCC CGCTTTCCAG TCGGGAAACC TGTCGTGCCA GAGATCTCTG 
CGAGTGACGG GCGAAAGGTC AGCCCTTTGG ACAGCACGGT CTCTAGAGAC 



2001 CATTAATGAA TCGGCCAACG CGCGGGGAGA GGCGGTTTGC GTATTGGGCG 
GTAATTACTT AGCCGGTTGC GCGCCCCTCT CCGCCAAACG CATAACCCGC 



2051 CTCTTCCGCT TCCTCGCTCA CTGACTCGCT GCGCTCGGTC GTTCGGCTGC 
GAGAAGGCGA AGGAGCGAGT GACTGAGCGA CGCGAGCCAG CAAGCCGACG



ClaI



2101 GGCGAGCGGT ATCAGATCGA TCTCACTCAA AGGCGGTAAT ACGGTTATCC 
CCGCTCGCCA TAGTCTAGCT AGAGTGAGTT TCCGCCATTA TGCCAATAGG 



2151 ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA 
TGTCTTAGTC CCCTATTGCG TCCTTTCTTG TACACTCGTT TTCCGGTCGT 



2201 AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT TTCCATAGGC 
TTTCCGGTCC TTGGCATTTT TCCGGCGCAA CGACCGCAAA AAGGTATCCG 



2251 TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG TCAGAGGTGG 
AGGCGGGGGG ACTGCTCGTA GTGTTTTTAG CTGCGAGTTC AGTCTCCACC 



2301 CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC CTGGAAGCTC 
GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGGG GACCTTCGAG



2351 CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG 
GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATGGCCT ATGGACAGGC 



2401 CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAGG 
GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTATCGAG TGCGACATCC 



ApaLI 



2451 TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA 
ATAGAGTCAA GCCACATCCA GCAAGCGAGG TTCGACCCGA CACACGTGCT 



2501 ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC TATCGTCTTG
TGGGGGGCAA GTCGGGCTGG CGACGCGGAA TAGGCCATTG ATAGCAGAAC 



2551 AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC AGCCACTGGT 
TCAGGTTGGG CCATTCTGTG CTGAATAGCG GTGACCGTCG TCGGTGACCA



2601 AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG AGTTCTTGAA 
TTGTCCTAAT CGTCTCGCTC CATACATCCG CCACGATGTC TCAAGAACTT 



2651 GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG 
CACCACCGGA TTGATGCCGA TGTGATCTTC CTGTCATAAA CCATAGACGC 



2701 CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC 
GAGACGACTT CGGTCAATGG AAGCCTTTTT CTCAACCATC GAGAACTAGG 



2751 GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA 
CCGTTTGTTT GGTGGCGACC ATCGCCACCA AAAAAACAAA CGTTCGTCGT 



2801 GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ATCTTTTCTA 
CTAATGCGCG TCTTTTTTTC CTAGAGTTCT TCTAGGAAAC TAGAAAAGAT 



2851 CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG GATTTTGGTC 
GCCCCAGACT GCGAGTCACC TTGCTTTTGA GTGCAATTCC CTAAAACCAG 



2901 ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA ATTAAAAATG 
TACTCTAATA GTTTTTCCTA GAAGTGGATC TAGGAAAATT TAATTTTTAC 



2951 AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 
TTCAAAATTT AGTTAGATTT CATATATACT CATTTGAACC AGACTGTCAA 



3001 ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGT 
TGGTTACGAA TTAGTCACTC CGTGGATAGA GTCGCTAGAC AGATAAAGCA 



3051 TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA 
AGTAGGTATC AACGGACTGA GGGGCAGCAC ATCTATTGAT GCTATGCCCT 



3101 GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA GACCCACGCT 
CCCGAATGGT AGACCGGGGT CACGACGTTA CTATGGCGCT CTGGGTGCGA 



3151 CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGG AAGGGCCGAG 
GTGGCCGAGG TCTAAATAGT CGTTATTTGG TCGGTCGGCC TTCCCGGCTC 



3201 CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT CTATTAATTG 
GCGTCTTCAC CAGGACGTTG AAATAGGCGG AGGTAGGTCA GATAATTAAC 



3251 TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCGCAACG 
AACGGCCCTT CGATCTCATT CATCAAGCGG TCAATTATCA AACGCGTTGC



3301 TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG 
AACAACGGTA ACGATGTCCG TAGCACCACA GTGCGAGCAG CAAACCATAC 



3351 GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC 
CGAAGTAAGT CGAGGCCAAG GGTTGCTAGT TCCGCTCAAT GTACTAGGGG 



3401 CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA 
GTACAACACG TTTTTTCGCC AATCGAGGAA GCCAGGAGGC TAGCAACAGT 



3451 GAAGTAAGTT GGCCGCAGTG TTATCACTCA TGGTTATGGC AGCACTGCAT
CTTCATTCAA CCGGCGTCAC AATAGTGAGT ACCAATACCG TCGTGACGTA 



3501 AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG TGACTGGTGA 
TTAAGAGAAT GACAGTACGG TAGGCATTCT ACGAAAAGAC ACTGACCACT



3551 GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTGCT 
CATGAGTTGG TTCAGTAAGA CTCTTATCAC ATACGCCGCT GGCTCAACGA 



3601 CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA 
GAACGGGCCG CAGTTATGCC CTATTATGGC GCGGTGTATC GTCTTGAAAT 



3651 AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT 
TTTCACGAGT AGTAACCTTT TGCAAGAAGC CCCGCTTTTG AGAGTTCCTA 



ApaLI 



3701 CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT GCACCCAACT 
GAATGGCGAC AACTCTAGGT CAAGCTACAT TGGGTGAGCA CGTGGGTTGA 



3751 GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG AGCAAAAACA 
CTAGAAGTCG TAGAAAATGA AAGTGGTCGC AAAGACCCAC TCGTTTTTGT 



3801 GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC GGAAATGTTG 
CCTTCCGTTT TACGGCGTTT TTTCCCTTAT TCCCGCTGTG CCTTTACAAC 



3851 AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT 
TTATGAGTAT GAGAAGGAAA AAGTTATAAT AACTTCGTAA ATAGTCCCAA 



3901 ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA 
TAACAGAGTA CTCGCCTATG TATAAACTTA CATAAATCTT TTTATTTGTT 



3951 ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCTAAGA 
TATCCCCAAG GCGCGTGTAA AGGGGCTTTT CACGGTGGAC TGCAGATTCT 



4001 AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC
TTGGTAATAA TAGTACTGTA ATTGGATATT TTTATCCGCA TAGTGCTCCG 



4051 CCTTTCGTCT CGCGCGTTTC GGTGATGACG GTGAAAACCT CTGACACATG 
GGAAAGCAGA GCGCGCAAAG CCACTACTGC CACTTTTGGA GACTGTGTAC 



4101 CAGCTCCCGG AGACGGTCAC AGCTTGTCTG TAAGCGGATG CCGGGAGCAG
GTCGAGGGCC TCTGCCAGTG TCGAACAGAC ATTCGCCTAC GGCCCTCGTC 



4151 ACAAGCCCGT CAGGGCGCGT CAGCGGGTGT TGGCGGGTGT CGGGGCTGGC 
TGTTCGGGCA GTCCCGCGCA GTCGCCCACA ACCGCCCACA GCCCCGACCG 



4201 TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA CCATATCGAC 
AATTGATACG CCGTAGTCTC GTCTAACATG ACTCTCACGT GGTATAGCTG 



4251 GCTCTCCCTT ATGCGACTCC TGCATTAGGA AGCAGCCCAG TAGTAGGTTG 
CGAGAGGGAA TACGCTGAGG ACGTAATCCT TCGTCGGGTC ATCATCCAAC 



4301 AGGCCGTTGA GCACCGCCGC CGCAAGGAAT GGTGCATGCA AGGAGATGGC 
TCCGGCAACT CGTGGCGGCG GCGTTCCTTA CCACGTACGT TCCTCTACCG 



4351 GCCCAACAGT CCCCCGGCCA CGGGGCCTGC CACCATACCC ACGCCGAAAC 
CGGGTTGTCA GGGGGCCGGT GCCCCGGACG GTGGTATGGG TGCGGCTTTG



4401 AAGCACTAAT AGGAATTGAT TTGGATGGTA TAAACGGAAA CAAAAAAAAG 
TTCGTGATTA TCCTTAACTA AACCTACCAT ATTTGCCTTT GTTTTTTTTC 



4451 AGCTGGTACT ACTTTCTTTA AAATTATTTT ATTATTTGAT TTTATTTAAT
TCGACCATGA TGAAAGAAAT TTTAATAAAA TAATAAACTA AAATAAATTA 



4501 AGTATATATT ATATTTTGAA CGTAGATTAT TTTGTTGAAA GTTGCTGTAG 
TCATATATAA TATAAAACTT GCATCTAATA AAACAACTTT CAACGACATC 



4551 TGCCATTGAT TCGTAACACT AATTCTGTAT TAGTCATTCC TCTTGTTTGA 
ACGGTAACTA AGCATTGTGA TTAAGACATA ATCAGTAAGG AGAACAAACT 



4601 TAGTATCCAA AAAAACGGCT ATTTTTTTGC AATCTTATTT CCTGCATATT 
ATCATAGGTT TTTTTGCCGA TAAAAAAACG TTAGAATAAA GGACGTATAA 



4651 ATACAGATAA CATAATGAAA GAAAAAATCT TTTTTTTTGT TCTTCAATGA 
TATGTCTATT GTATTACTTT CTTTTTTAGA AAAAAAAACA AGAAGTTACT 



4701 TGATTTCAAC CATTCTTTTA AACATTGATC AATTCCTGAG CAACAACCCC
ACTAAAGTTG GTAAGAAAAT TTGTAACTAG TTAAGGACTC GTTGTTGGGG 



4751 ATACACACTG GTTTATATAC CGCCCCTTTT ACAGTTGAAG AAAGAAATAG 
TATGTGTGAC CAAATATATG GCGGGGAAAA TGTCAACTTC TTTCTTTATC 



4801 AAATAGAAAT AGCAAACAAA AGATATGACA GTCAACACTA AGACCTATAG 
TTTATCTTTA TCGTTTGTTT TCTATACTGT CAGTTGTGAT TCTGGATATC 



4851 TGAGAGAGCA GAAACTCATG CCTCACCAGT AGCACAGCGA TTATTTCGAT 
ACTCTCTCGT CTTTGAGTAC GGAGTGGTCA TCGTGTCGCT AATAAAGCTA 



4901 TAATGGAACT GAAGAAAACC AATTTATGTG CATCAATTGA CGTTGATACC 
ATTACCTTGA CTTCTTTTGG TTAAATACAC GTAGTTAACT GCAACTATGG 



AvaI



4951 ACTAAGGAGT TCCTCGAGTT AATTGATAAA TTAGGTCCTT ATGTATGCTT 
TGATTCCTCA AGGAGCTCAA TTAACTATTT AATCCAGGAA TACATACGAA 



5001 AATCAAGACT CATATTGATA TAATCAATGA TTTTTCCTAT GAATCCACTA 
TTAGTTCTGA GTATAACTAT ATTAGTTACT AAAAAGGATA CTTAGGTGAT 



5051 TTGAACCATT ATTAGAACTT TCACGTAAAC ATCAATTTAT GATTTTTGAA 
AACTTGGTAA TAATCTTGAA AGTGCATTTG TAGTTAAATA CTAAAAACTT 



5101 GATAGAAAAT TTGCTGATAT TGGTAATACC GTAAAGAAAC AATATATTGG 
CTATCTTTTA AACGACTATA ACCATTATGG CATTTCTTTG TTATATAACC 



5151 TGGAGTTTAT AAAATTAGTA GTTGGGCAGA TATTACCAAT GCTCATGGTG 
ACCTCAAATA TTTTAATCAT CAACCCGTCT ATAATGGTTA CGAGTACCAC 



5201 TCACTGGGAA TGGAGTGGTT GAAGGATTAA AACAGGGAGC TAAAGAAACC
AGTGACCCTT ACCTCACCAA CTTCCTAATT TTGTCCCTCG ATTTCTTTGG 



5251 ACCACCAACC AAGAGCCAAG AGGGTTATTG ATGTTAGCTG AATTATCATC 
TGGTGGTTGG TTCTCGGTTC TCCCAATAAC TACAATCGAC TTAATAGTAG 



5301 AGTGGGATCA TTAGCATATG GAGAATATTC TCAAAAAACT GTTGAAATTG 
TCACCCTAGT AATCGTATAC CTCTTATAAG AGTTTTTTGA CAACTTTAAC 



5351 CTAAATCCGA TAAGGAATTT GTTATTGGAT TTATTGCCCA ACGTGATATG
GATTTAGGCT ATTCCTTAAA CAATAACCTA AATAACGGGT TGCACTATAC



5401 GGTGGCCAAG AAGAAGGATT TGATTGGCTT ATTATGACAC CTGGAGTTGG 
CCACCGGTTC TTCTTCCTAA ACTAACCGAA TAATACTGTG GACCTCAACC 



5451 ATTAGATGAT AAAGGTGATG GATTAGGACA ACAATATAGA ACTGTTGATG
TAATCTACTA TTTCCACTAC CTAATCCTGT TGTTATATCT TGACAACTAC 



5501 AAGTTGTTAG CACTGGAACT GATATTATCA TTGTTGGTAG AGGATTGTTT 
TTCAACAATC GTGACCTTGA CTATAATAGT AACAACCATC TCCTAACAAA 



5551 GGTAAAGGAA GAGATCCAGA TATTGAAGGT AAAAGGTATA GAAATGCTGG
CCATTTCCTT CTCTAGGTCT ATAACTTCCA TTTTCCATAT CTTTACGACC 



5601 TTGGAATGCT TATTTGAAAA AGACTGGCCA ATTATAAATG TGAAGGGGGA 
AACCTTACGA ATAAACTTTT TCTGACCGGT TAATATTTAC ACTTCCCCCT 



5651 GATTTTCACT TTATTAGATT TGTATATATG TAGAATAAAT AAATAAATAA 
CTAAAAGTGA AATAATCTAA ACATATATAC ATCTTATTTA TTTATTTATT 



5701 GTTAAATAAA TAATTAAATA AGGGTGGTAA TTATTACTAT TTACAATCAA 
CAATTTATTT ATTAATTTAT TCCCACCATT AATAATGATA AATGTTAGTT 



5751 AGGTGGTCCT TCTAGCTGTA ATCCGGGCAG CGCAACGGAA CATTCATCAG 
TCCACCAGGA AGATCGACAT TAGGCCCGTC GCGTTGCCTT GTAAGTAGTC 



5801 TGTAAAAATG GAATCAATAA AGCCCTGCGC TCATGAGCCC GAAGTGGCGA 
ACATTTTTAC CTTAGTTATT TCGGGACGCG AGTACTCGGG CTTCACCGCT 



5851 GCCCGATCTT CCCCATCGGT GATGTCGGCG ATATAGGCGC CAGCAACCGC 
CGGGCTAGAA GGGGTAGCCA CTACAGCCGC TATATCCGCG GTCGTTGGCG 



5901 ACCTGTGGCG CCGCAGCGCG CAGGGTCAGC CTGAATACGC GTTTAATGAC 
TGGACACCGC GGCGTCGCGC GTCCCAGTCG GACTTATGCG CAAATTACTG 



5951 CAGCACAGTC GTGATGGCAA GGTCAGAATA GCCCAAGTCG GCCGAGGGGC 
GTCGTGTCAG CACTACCGTT CCAGTCTTAT CGGGTTCAGC CGGCTCCCCG 



6001 CTGTACAGTG AGGGAAGATC TGATATTGAC GAAGAGGAAC CAATGTAACG 
GACATGTCAC TCCCTTCTAG ACTATAACTG CTTCTCCTTG GTTACATTGC 



6051 TTACACTGAA GAAAACACAC AATAAACGGG AAGAAACGGT GTAAAAGTGT 
AATGTGACTT CTTTTGTGTG TTATTTGCCC TTCTTTGCCA CATTTTCACA 



6101 GAAAATAATT TTTGAATATC ATTTCCCTTG GTTTAATTCC AAACGAAACG 
CTTTTATTAA AAACTTATAG TAAAGGGAAC CAAATTAAGG TTTGCTTTGC



EcoRI 



6151 TGTTTTTTTT AGAGAATGGG AATTCTTATT GGATGTCTAG ATTGTTTGTT 
ACAAAAAAAA TCTCTTACCC TTAAGAATAA CCTACAGATC TAACAAACAA 



ApaLI 



6201 TACTCCAGAC TGTGCACAAA AACGTTTGGA TGGATGATCA GAAGATATTT 
ATGAGGTCTG ACACGTGTTT TTGCAAACCT ACCTACTAGT CTTCTATAAA 



6251 TTAGGCTTAG CTCTAAATAT AAGAAATGAT GCTTGAAAAA CCAGACAGAA 
AATCCGAATC GAGATTTATA TTCTTTACTA CGAACTTTTT GGTCTGTCTT 



6301 ATTGAGTTTC AAAAATTGGT AATGTGAGGT ATTAGTCAAC TAACCAAATA 
TAACTCAAAG TTTTTAACCA TTACACTCCA TAATCAGTTG ATTGGTTTAT 



6351 ACAATGCAAA CCGGTTGATA CATTTCATTT TGAAAATAAT GAAACTGGAA 
TGTTACGTTT GGCCAACTAT GTAAAGTAAA ACTTTTATTA CTTTGACCTT 



6401 TTGGATGACC AGCACACAAA CACATAAAGT AATTATGGGA ATTAGAAGCG 
AACCTACTGG TCGTGTGTTT GTGTATTTCA TTAATACCCT TAATCTTCGC 



6451 AACATAGAGG AGTACTTGGC CACGAACAGA ATACAAGTGG GAACACTATT 
TTGTATCTCC TCATGAACCG GTGCTTGTCT TATGTTCACC CTTGTGATAA



6501 TTCTCCATTG TTTTAGTTCT GTTTTTTTGT CAGCCTAGTT TTGTGCTATG 
AAGAGGTAAC AAAATCAAGA CAAAAAAACA GTCGGATCAA AACACGATAC 



HindIII

6551 TGTAAAAAAT ATTGCCAAGA AAAAAAGCTT GTTTTGTGGC CAGTGTCCGA 
ACATTTTTTA TAACGGTTCT TTTTTTCGAA CAAAACACCG GTCACAGGCT 



6601 AAAAAATTTT GGGGAATCTT CGGATTAATT TATGTTTTCA
TTTTTTAAAA CCCCTTAGAA GCCTAATTAA ATACAAAAGT 







Sequences with unknown function, C. albicans sequence NOT present in the public domain 
(ALCES/EMBL) 

>328c2 1803bp in-house: 1123-1803 public: 1-436/468-1021 PathoSeq:
437-467/1022-1122 

ATGTCTATTACAGTTACATTTCCGAAATCTCCATCTACGAAAAAACGTGCACCG 
GCATTTGGAATTGAGTTGGAGTTYAG 

TCAMCAAGSCAGTAGCGATGGTGCTATAGAGAAAAGCGGCATTGGCAGTTCCT 
GTGTrTAGCGTTGACA.\CCAAGACTWT 

GTATTKATAAGAGAYCVVTGCCAAGTACTGGGGCTACCCTTCATCGTATCAATT 
GATTGTCAAGTTGGTCAAATGTGCTAA 

CATTGAAAAGTCGCAAATCTTAAAGACCGATAAGGATTTGAATAGAGAGTTGT 
TTGAGTTGGATTTGATTGAAGAAGCAG 

ATACAAAGATTGATCTTTTITATATTTCGTTACCCTrGGTCTATTCAAGAATAGA 
AAATAAGAAGGl ill I iATGTTCTG 

CGTGAACCAGAACAGCCAAAGGTGTCGAAAGCMCCAACACAAGAGAAACCAG 
CAAGTGTGGTTGCTGCAGAAGAAGATGA 

CGATAATCTAGATGATGATGAGGAGGACGAAGTGGATGAAGACATGGATGAA 
GATAATGATAATAGTGGGGAATTGTCTA 

AAGGATACAAGCACATGCACAAGGACCATCCAAAGTATATAAATGACGATAG 

GGTTACTATTGGACAAGTGTTTCATCAA 

TACGGACTTGACCCTTCGACACCATTAACCCATTCACTTTTCAATAGTATCAAC 
TCAATGTCGAAGCTAA.\CTATTACAA 

GAATTTTGGAGTTTCAGGTTACCGATrTCTTCCCAACAGCAAGTTATCTTATGC 
AGAACGAGAATTGGTGTTGAATGCCA 

ACAACTACAATGATATGCACATTAACGAAAAGACAGAATCCAAGCCGAAAAA 
GAGTTTCCGTAAACCCATTGGAAAGTCA 

AAGAAACATAACTTGCAGATTGATCCGAACTCCATAGATTTAAGCGAGTCAGT 
GATTCCGGGACAAGGGTTTATACCTGA 

CTTTAGTATCCACCTATCTTTGCAAAGTCCCTAATTATTATGTGACATCAACCC 
ACCAAAGTCTCCCGCTGTCGTTCAAC 

ACAAAGAATCTTAATGCAACTTCGAACTCTTCGTATTTGTTTAATGATAATGTC 
AAGATAAAGTCAAAAAGTATTCAGAA 

GTWSGTGTTCAACAGCGATACCGATAATTACCATCACACAAAGTATTTCTACA 
CCAAAACCTACCGTGGTCCAGGGTCGG 

GG AATTACAAGGATGGTGCATTGATGAACAAAATCAACAAGATACATCTTTCC 
AGTAATAAAAAGCCGCGCCACAAGAGA 

AAGGTGTCGAACAATAACAGGTACAACAAGAGTITAAAGGGGTTAGTCCACG 

AAAAGTTTGACAAGAACTTTGTTGAGTA 

CTTGCTTTCTGAGCAACGCAAGTATACCGAGGACTATTCCAATCTTGAAATTTT 

ACACAATAGCTTACAGTTTAATGTTC 

TTTTGAATACGTATCGTGGTGTTGCCCAAGAGACATGGAATAACTACTACAAG 
TTTAAATTGATTGATTTCGAACAATTG 

AAGGCTTTGCAAATGGAGGCAAATGAGCTTGAGGAGAGAAAATTGGATGCTG 
CTAGACACCAACAGTGGGCGGAAGAAGA 

GAAGCnTTNCCAAGAAAGATTGCGTTTAGTATTTGAAGATGAACGGACGAGTT 
TGAGCAATTGCAAAGCGAGTTTGGTCA 






GAGAAAGAAGGATTTCKjAAGAGAAATTGCGTCGCCGTCACK:TANANGCATC^ 
TGAhrrGATAGTTTTGAACTTGATAGCG 

AAAATGACNATGAATCTTGACTTGNCCAAANTNAACAAGACTT 



>II3g4 844bp in-house: 1-844

ATAGAACTGTTTGATATACAACTATCTCACTCCC^ATTCTGACTTGAiTAAATAATAATACCTATCACCTAGTAATC^ 

ATCTTAACGTAATCrCTGCAAAGCACAATCAATGTATAAAAGCATAAAGATAAAATCTTGGTGAGGTTTAAGT^ 

TATAATGAACAACAATTACTAAAAGGGATGGTATCAACAAATTATAGGCTAGGTACAACCATAGTGCrTGTTCGGGAGTT 

CGGGTAGTTTGGGAAGGTTGGGAAGGTrGGATAGTTTGAGAAGGTTCaSTGGCTGArrCTAAATTAACA^ 

AATGTACAAAAAAACATTCAGAATmAJUCAACCTTTATATATATATATTAAATGCTCTTGTCATCAACT^ 

TGTTGATGATGCTTTCCTGTTAAATATACCTTTAAGAACCAGATTC 

GTTTTGACAmCATAATGACACAAAAGATTGTGAAATATlTrrAGCCTCAAGGGGArrCTA 

CACATTCTTTGTATCACCAATACCTTTTGCTAACAGAGGAACAAAAAATTGACACGCATGTCATTTACCCTAT 

TCACTACAAATCAAAGGATTTACAATAGTGGCAATGTCAAATCATGTATArrATTAACACATTACACATATTTAr^ 

GGTACATAATACTCAATATCTAAAACTTCAAAATGGTACTGTACCTTAAACmCTCCTTCATGTCTAGTO 

ACTTGCTAATGTCAAAAATCATGTCTTCACACATTCCAGGTTGT 







>l5cl 977bp in-house: 1-977 bp

TTmTTTACAATATAGTTAGATCTCTTTTTAAAiTTTGAACACAAAlAAACAA 

CCACCACCAAAACATCATAGTGCAACTTAArrGAAGAATATATTAATAACCATTAATT^TAATAACATACTCAAAAGGA^ 
TAGGAGTAAAACCTTTATATGTAAATTAATTAAATAGCAAAAAAAAAAAAAAAAAAGGAAAGATTTCAACAAATCTT 
ATTAAATTAAATTTCAmCATTTCnGUAGTCATATAGTCGTAATAGCAGTAATArTAGCAATAATAm 
TTTAAAAATAACAATTATAATAATAGTAXTAATAAACGAATTTAACAAAACAAAAAAAGGGGGGGGAAGACAACGAAT^^ 
AGAAGAAGAAAAAACAACAAGACGGGTAGTAGATATATCTGGCTTAAAAAAGCATATCTAAAGTACAGCAAACACATAAT f^^ ^ 
GCAGC A AGACAACCCATTa A AC A AGAATCArrACCTCCAGAACGTGGTTGTTGTTGTACATACATAGGTTGTrGnGrrG f^f/j 
TTGTTGTTGATAATATCCACCACCACCACCTGGCTGTTGTTGATAATATCCACCTTGTGGAGGTGGTGGTCC^ ^ 
TATATC C FT G IT G I FCTI rCATAGTGG^CATGACCACC ACCACCACCACTAAACATCCCTCGATCTTGTGTTTGrrGAGAA 
TAATTGGGrrrGTGATTGTGGTACATAACTTTGTTGTGGTTGTTGTGArrGGGGTTGATTATTATAATTTGGTGC^^ 
ACTAGGTTTACCGAAATATTCGTCTTTTCACATTGTTTATATTTATAGGTGGTGTAAT^ 

GATTGAGTATATAGAAGTTGGAAAATTTAATAACAATTAATCTAAACTTGATATAAGATGGATTAGCAATGATAATGAAG 
AAGTAAAGTTGAATCTG 

I QQSYVPQSGP NYSQGTCDRG MFSGGGGGHG HYQQQQGYKA YGPPPFQGGY 
51 YQOGPGGGGG YYOCp^IQC? MYVQQQP5SG GNDSCIKGCL AAICVCCTID 
IQI HLF 



::23r1 2-:i99S:; ; EP983i 06^:9:: 



>207g4 7b9bp in-house 1-7=9 

GCAAGATCTA^ACTCCAGTTTTTTGGTGTAATGTTACACAAGCAAACAAAATATAJUTCGAiUAA 

CTTCTACAAATTACGAAAAATGTTTCACATGTATGAAAAAGCTTTATCTATACTAmCTCCTCCA^ 

AATGATACTGATATCTCCTATTAGGATACAGTTATCTATTATAAGTATAATAATAATCATGGAGATAAATATATATTAAA 

TCGATGGAGTTAACGAGAAAAACAATACAACCCATTTTGCAGCAAAATGAGACATTTCACAGAAAAAAAAACAAGAAAAG 

AOU^TTACTCCATTCAAATAATTCCACAATAAAAAAATAACAAAGAACAAACGTACTAACAAAAAACATCACTAATTTCA 

CTTTGAAAAICTTTACATiCTCAACTTCTAAAGATTAATAATAAGCGATGCATATTCATCAGAAl ^ 

TGCAGCTGAmTGAGCCAGGTGAAACAATTCTTTACTAAAAATCTA'SGAGTTGmATATACAGTATT^ 

CTGTCTCTAACGTATACAAGATAAGATTTGTAATCGGTTAGAATAACAAGAAGGTGTGGTTGTGGACTTGGTGGT^ 

CAAATTTGAATGATATAnGTTTATCTCAAGTATAGC^ATACAAGGGCAAAAGGCTGCAACAAAACAAGAACTTGGAT^ 

GTCGCAArrCTCTTCACCCTTTCAGAATGrrCCTCGTGTATGTGATCAAT 



>226c aCl 766bp in-house: 1-766 bp

4ACGTlATTGTTATATTTTACCiiGGTAiCA03GGACCTCATTATCATTAGnGTCAATTC^^ 

AACACAAGACTTCrnGGTCrrGCTirriAAA'SATAATATATAATCAGGATAAAAGA^ 

CAGGGACGGTAAATCATrCTTCTTCCCTATAJaCCAJUAATCTTATATGTCCCAAoTTA^ 

ATTTACmACAGTGAATCATTAAA:irrTTAATrGAAAGCGA<rrrrA<XTCAATGjCrrCAG^ 

GCACCACCAACAAAAGCACCAGAiGCCTCCATGGATCTGGGTACAATTCCCAAAA^ATCTC^GCAAGATO 

GTCKATATCATCATCATCATCAAiiSATiAGCCAGTATATGCAGAAAAACCXCTTCTCAAGAAGCA^ 

AACCAATAAiAATAACTAAACAACAAGTACCAGCTAAACAAATAGGTACATCTGAVrCATCCTCG^ 

tcgagtcatgataattcatgtt:cgatt:aagtgcagcttctatattttctgattctaaaaata^^ 
gttactcacagatgatatagaggacatattagaggacataga cgat gctgagatatacgatgctgagaaggttaccataa 

CATATATAAGrrCTAAATCATGCTAATACACATTATTAATTATTrG 




>233c_cpl_full 500bp in-house: 1-500 bp

GAAiUATC^ACAACAACjUCA^Ci—AAGCCAAGTGATAGTACCAAATCTACTTTAGCAA&TGATCAAACAAGAAAAAC 

ACTTGATCCrrAAACXXCrTGGAA(XACTACAACA<Kn'GATAAAGACACA<rrTTCATCAGACAAAOCATCTCr^^ 

AAGATAAAGAAAGTTCACCATCCCTAGCTGGAAGTTCAACATCAACACCAAGTGGAACTGATAAAAAAACATCTCCTAAA 

AAATrAGTTiCCAATGCTOTCAATiAAGTTGAAAATAATGATGATTTCAAAAAATTCATTAATGAGGCTGAAAAGGAAGC 

TAAAAAATCCAAATCTGGATTGAAAAiATTATTTAACAAGAAGTAGAAGTTGTTTAAATTGTTTCGATATAAATTGTATG 

AATTCCAGTrn-rrrATTITTATTrTiTTTTATTITAGTTTAGTrTTGATTrCAT^^ 

CAATATAATCTTTA T TTTT7 



>22g3 (5') 535bp in-house: 1-535

AGGTTCCAGTTACCAAnTAGGAAGTGTGTTGCAAGCAGGGCTACCAAATATG 
GGTGGCAACACATATGGTAGTAAGTGC 

TACCAATGTGGGTGCA^AAAAITTTGCCAAGTAATTTGTATGGCAATAACAGA 
AGTGTTGGCGGATTCN.^CTGAGGAAT 

CnTGGTGTGTAAAAA-AAAAGCAATAGCGACTACGCTACAANAGGCAATCNAT 
TATrATTATAAAGTGG.\.\GTrATATAT 

AT^^ITCTCGGGGGGGGGGGGGG^^ITNGGN^^rCCCCCCCCCCCCCCCCAN^^I^ 
TNTCGGCCCNCCCACCNTNCGGCCTTC 

TGGCTCCCCCCCNNCGGGCCNCNNGTAAATNCCTCCACCCNGGGANAANGGNA 
AANGGGGAACNANNA.AGGGGGGACNNN 

NCACCCNATGGGAGGGAAAATCCCNAANNTTTNCCCCCCCNNCCCNGCCNAAN 

CCNCNTGGGGNGGGCCAAANNCNGGGG 

GCmCNCNCCCTNCCCCCCCGCC>rnsrNCCCN>fhWTNCCN^ 



>22g3 (3') 426bp in-house 1-426 

CCCCCATATAACGTTGTCAATAGCAATACTCTGTCGCACCCATAGTGTGCACTT 
CTCGGTGGTATAAAAA.\AATTTTTTC 

TCCCAAAAAAAATCTrCTCCTTTCCACCAC 1 11 1 1 1 CTTCTTCTTCCTTCCCCATT 
CCCTCCCAAATCCCTCATTTTCCC 

CATTTCCCCTACCCTCCTGGCCCTGTATTCCAAAATTTTCTCGGGGNTACGCCC 
CGAAGANAACCTCCCTCCCACCCACC 

CATCTTTGTCNGGNITTCGACCTTCGGCCTCANGGCTCCACCGTCGGGGNTCTTG 
TATATTTGTAGACTCCNGGAAAAAGG 

GAAAAGGGGAGGAAGA.\GGGGGGAAAAAAAAAAAANGGAGGGNGAATCCTT 
TTTNl 1 1 INCCCCCCNTCTCAAACCNAAA 
CCCCNTNTGGGNGGTCN.\ATTAGGGG 




GC 



>35gK 1334bp in-house: 146-669 public: 1-145 PathoSeq: 670-1334 

ACAACGTATAATCGACAGTTTACTATATCTGCTGACTTCAAAACCAATGCATTC 
TTCAAGCGTGCTCTGTCGATTTCTAT 

CATAACATCCACTTTCCGGNGTAATCGGATTACTAAAGCCACAGAATCAAGGT 
G AAC ATC AAGCTTC AACTTC 11 1 C 1 1 G 

GTCCACGAATAATTTTAATTTGGTT^m•SKKGSMA^IKGCTTTCTACRGTAGGTT 
TGAATCnTCCAACATTGTCTTTGCA 

TAGAAACMGCACCAGACAAGAAACATGTCCACTCGACCATCAACYTSKGGGT 
AWWGACAAAGTWAATCTGTCTGGATCCT 

TTTCATCCAGTTTCCCTGCATKGGAWACAAGTNTGTCCCGCACAGTTAAGACT 
Gil 1 n ATTTTSKTGGTATTAGACTCA 

TCAAGTTCCGAAGGAGAGGCATCATTTARGGGWATAGACTCCGCTGAGTTAAT 
ACTGGATAAATCACTTATTTCAGATTC 

ACTGACTTGTWCTTCAGTGACCTTATCAAAATCCTCAATGTACTCSGARGCGTW 
TTCMCTCMATGTGAAGGCTTTTAAAA 

GGGCAACRCTGGTTYCAAAATGC 111 C 1 IGCRAGTTTGTACKTGACAGAAAAA 
TCAAAAACYTTGAAAGATATACCTCTT 



CTAAAGTCiri 1 AAATCAA 1 1 1 C 1 1 NTCCTAATnTTCATCATATAGCTTATGAC 
TTGGCAAACCCTCCTTACATACCAT 

ATCCATTACAATGCTAGAAATGTCAATCTTCACTGACGATATAAAGGATGGAA 
GAACTTCAAATAATTTTATAAACTCAG 

GATTGGCTGGTGTATCTGCTGCAGGAGCTCCAGATTTATTGTCCATTTGCTCAC 

TCCATGGACATACATTATTAACGTCC 

ATC 111 11 CCATTCTTC AAAlll CI 1 CGGTGAAATAAATTCGTTGACGRWTTTTA 
AACAGACGTACAATGTGAAAGATAA 

GATCATrAGCAGAGAGCAATTCGAGACTCTTGCTTGAAAGTTTGATTGACACG 
TTTTGTTGTAACATATTGTAGGTGGCT 

AAAAGATTGACTTWRGTAAAATGRAACTTATTAACCCTGGGCCCTCACATTTC 
ACAlllll CATCTTAA.\CAAAGKGGTT 

CAAAGKGGAACTTGGTTTGGATCCYTrAWTGGAAWATTTCYCAGKRAATACTT 

TCAAAATCAACTCCAGGAGAGCCACAG 

TGATAATTGAATTGGATrrAGATAAGCGGTTAAACTrCCCAATTTCAGTTTrAC 

CAAACTCTGGTAAATGAAGGTTAAGT 

TTTGTGTCCACCACAACAAGTTTACTAAAAACAGCCTTGAGCATrrTGGAGGCA 

>36g2 (5') 520bp in-house: 1-520

CGTATAGAGAATAATCCGTTGAAATTGATTGTTCAATCATTATTGTATCTITTCC 
CI 1 rniTlGTTCTAACCATAATGT 

TAGAATAATTAGAAATrGTCTAAATATATATTCAGTTTAACAAAAAACAGAAT 
GCTTGCAATAAGATTTGATTTCTAATT 

ACTAATCGTrAATATTrAGTTTGGTGGGGTTTTATTTATCGAAGATGTAGCATT 
ATTTGTATCNAATAGATAAAGAAACT 

TGAATTAAATGGCNTAATTTGTTGCAATAGTAAAAAAGAAGAAAAGTGGTAAG 

GAGTGAGTGAAAATATTTnTGCCCCA 

ATTTGAGTNGAAATCTrACACCNAAAAGTTrGGACNAAAAAGTTTTTACTAAA 

ATCTGANAATCTNCCTGAATAGAACCG 

ATCATCCNCATNTCCGATTTCNTGAGGANAGATAGTGGCCCCACCTCNTGGTG 
ATTAGAAGGAGCNCCCATGTTTTACAA 

TATCTATATCCAGAATAACNTGTTTGTGACCTCNCCCCNG

>36g2 (3') 472bp in-house: 1-472

CTCTATATATAGTGAA.\TATAACATCAAATAATGTACAAAAAAGTATAATAAA 
TTGATTTAGAAATGAGAAAAAGAAAAA 

AACTTGAAGTAGTGAAGATATATrTGTTGGCTATCTTTCTTGGTATGGCTCAAT 
TCAGCCAATCTrGGATGA.AAGGTTGG 

AGTTrTAGTrrCGTGGTTTATTGATTTGTAAGTACTTTCGGGCTAGAAAGTTNA 
CAAACATGATTAATCTTGATATANAT 

ATTTGTTAAACATTTGOTGCTCCl^CTTAATCNCCCAAAAAGTTTGGGNCACTA 

TCTTTCCNCCNGAAATCTGTATATGT 

TGA^^•GANCCG^^•CCAT^CCTGTTNA^^ITTCNGA^^^TTAGTTAAAACC1 1 n iG 

TCCCAACCTTTTGGGGTTAGANTTCN 

NCCCCA>n-GTTGCCNTs'.A.AATATTNCNCNCCNCCCTNCCCCTTTCCCC>rmTAC 

NAATGCACCAAGTAAGCG 



>38gl 1348bp in-house: 183-940 PathoSeq: 1-182/941-1348 



TCTCTGGTATAACTTGCACTACCTCATCGCTACCCCGGA 1 1 1 l i I Ti l GGTATGA 
TCTACACGTCCTCATCGCTACCCCA 

GA l 1 ill 1 n CTGGTGCGCCGGACACGCCCTCCGGTCCGCACCGAAAACCGGGG 
TAATCTCCGTCGGAGATACACATCCG 

CGGACACAAAATCAGATGAGCTACCACCGAAAATTCCGAAATTTCAAAAACTC 

AAAATCCCTAAAAACA.\ACTATCCAGA 

NATTATTGCCATGCCCTGAGGATGAGTTTAGTTTTTTAAi 1 1 1 iGAAAAATGTC 

CAAAACTGGTTGTGCTGTATAGGANG 

GGTAAGAATTTGCCATTCTGCCCCTTTGGGTGGGTCAGTCNAAAAAAGANGTA 
TCACTCTGGTTCNAACGGGAAACAACN 

NAAAATGGGATrAAANn^ATCTCCAGAMCAAACTTAGCITMWWACACCCAY 

TTTAGTTGTACTSGYG'^V'RCCMAAMMCMAA 

TTTTCCATTTTGTTTGGGGANGGGAATTTARACCAAAWl 111 1 i 1 1 1 IGAAATTT 

CGCTMAGTGTYMAGAMCCSCAAAAG 

TCACCmTTTCGTTTTCMNICYACGGCARARGCYCACCGGTrTTKYKTGGKGS 

MCRGCCMAATTGAWTTTGTGGGTGSGC 

ACGKGGAAAAACAGTTXGTTAGTGGACACGTTTTTGCAGTGTGAAACTGCGCT 
CGGAGGTACTATATGCGAAAGCAGAAA 

AGACAATTGCAAGAATACAGAGAGTTCTTCTCTGGGCTANNGCAATGTGTTTA 
AGGCCAAGTCGACGAGTGGGGAGAGTC 

TGGAAGTGATATACACATCACGACCTACTTTATACGCTACGTTCGGCATGGGC 
GAGCCACTGTACGGTGGCAAGCCTGAA 

CAGTCCCACACCAGATATCTAACGATTCTGTGTATGGGCACTGATGGGATITAG 
TGGATTACTAGCTGATAGCAAGTATT 

GAAAACTAAAACCCGACTCGGGGGTATGCCTTGGCAAGTAGCCGGAGTAAAAT 
CTGTGACTTTGCTGAGTGTAACTCCCT 

CCATGGTTGGCGATGTTCGACGTGCGCGGCAGTTCTTGTCGTATCACAGTCGCA 
CGGACACCACACCGGGAGAATCTTAA 

GAGGGCTATATGGATGTGGAACGGTTTGC1TGCTGTGGTAAAACACTGGCGGG 

CGAGCCGACGTTCCACGGACACAGCAA 

TGTGTTTGCAACCAAATAAATAACTTGTACGGTITGAACGTGTmTGGCTGCT 

CCTTCCAGTTCTTGGCGGGAGAAGCT 

TGGGCGCGGGAAGACCACTACTACGTAGTTATCTGGTTGATCCTGCCAGTAGT 
CATATGCTTGTCTCA 




>60gK 990bp in-house: 445-752 public: 1-140/753-990 PathoSeq: 141-444 

ATTACCGATCCGTCGGATTTTAAAACCACAAAATTGCCTGCATTAGCAGAGCT 
AGATATTTTCATAGGGTGCTATATATG 

CAAAGATCTATTGAATG CACCC GTGAGGACACAATGTGATCACACGTACTGTT 
CACAATGTATACGAGAATnTTACTTC 

GAGATAATAGATGTCCGCTTTCfAAAACAGAGGTTTTTGAAAGTGGTCTAAAA 
CGTGATCCATTGTTAGAAGAGATCGTC 

ATTAGTTATGCCTCCCTTAGGCCTCATTGATTACGATTATTGGAGATTGAAAAG 
GTGGAATCGAAGCAAGAGGTAGATCG 





TGAGAAATCAGCCAATGAGTCAGCGCTGAATGGTAATAGAAATGTAAACAAC 
GATGTTGACGAAACTGTGCGCGTTAAAG 

ATCAACTGAATGCAGATAAACTAGGTGAAGAAAAAGGGCAAGCTCAACATGG 
GGAACAAGTNAAACGAGCAGACTACTGA 

AGTTATTCTGTTGCTATCTGATGATGAAGAGAATGGTTCTGATAGCCTAGTAAA 
ATGTCCTAl l lUl 1 1 IGAGAGAATGO 

AATTAGATGTACTACAGGOAAAGCNTATTGACGACTGTCTAAGTGGAAAGAGC 

ACGAAGAGGACGCCTACAGACATTTTA 

TCCCCAAAAGCCCAACGACCGAAGCAAATCACCTCCTTTTTCCAACCAACAAT 
AGATACCANAACNCCTTCCCCCACCTA 

CCAGTTNNGGCGTCNACAACTCCCACAGCAACTCCGACAACTACATTGTTGAA 
AGCAAACGTCTCATCTCCATCCCAAGT 

GGCGCAAAGTACAGTAAACAAGGGCAAGCCATTACCTAAACTCGATATCAGCA 
GCrTGAGTACTAAAA.AAATAAAAGCdA 

AGTTGAGTGATATGAAACTACCAACAACAGGTAGTAGGAATGAAATGGAAGC 

CAGATACTAGCATTACTATGTGATTTAT 

AATGCCAACCTTGACACCAATCATCCTGTA 



23.12-1998: 




>64gB 627bp in-house: 1-627 

TNCANCCTNCCATNCNCCCAGGChnsiNGCCACCCCNGCCGNNCCCCCmTSITTTC 
CCCCCCTCCTTNGTNGCCCTCNNGGTG 

GTGTTTGTGGTGTGACNAATAAANATGGTNTATCATTAGAANAGGACATTGCN 
NCGGAAATGACTGTCGACAATAAAGAA 

GCAAATATATACAATGGATTATGAANGTGCTAGGATGGATTTGAAAGTTTATC 
TGGGTTTATTCCAATGTAAAAATTATT 

TGTAATTGATATGGCrAATTATTTTGCrCNATATNTATCACAAAAAAATGATTA 
AGTTCGAAATGAAATTGGCNTCCATA 

TATAAAATTTCTGACAGGAAGAGAAAATTCANGACNTGTTGCCCNAAAAAAAA 
AACnTACCCCNCNTCNANTCNTGTNN 

GCCGGNNGl lllll AAAATTTNAIWCTT 

GAATATGAACCCAA>JNTTTGNNTTCN1 111 1 NCCACNCCCCCTTCAAATTTNAT 
TCCATGTTCCCAAGANNAGGGNGGNG 

GGGGNGGTTCChn^CTTTTAAACCNCCCCCCCCGGGTGGNGGGGNCCGT>rrTNT 
TTCCGGNGGGGCNT 



>8c_cp 890bp in-house: 287-890 public: 1-124/154-286 PathoSeq: 125-153 

ATGCAATTCTCATCCGGTGTCGTCTTATCCGCTGTTGCTGGGTCCGCTTTGGCTG 
CTTACTCCAACTCCACTGTTACTGG 

CATTCAAACCACTGTGTCACCATCACTTCATGTGAAGAAAACAAATGTCACGG 
AAACTGGAAGG1TACCACTGGTGTTAC 

CACCGTCACTGAAGTTGACACTACGTACACCACCTACTGCCCATTGTCAACCAC 
TGAAGCTCCAGCTCCATCTACTGCTA 

CTGATGTTTCTACCACCGTTGTCACCATCACCTCATGTGAAGAAGACAAATGTC 
ATGAAACCGCTGTCACCACCGGTGTC 



ACCACTGTCACTGAAGGTACTACCATCTACACTACCTACTGCCCATTGCCATCT 
ACTGAACXrrCCAGGTCCAGCTCCATC 

TACTGCTGAAGAATCTAAACCAGCTGAATCTTCCCCAGTTCCAACCACCGCTGC 
TGAATCTTCCCCAGCTA.\AACTACTG 

CTGCTGAATCTTCCCCAGCTCAAGAAACCACTCCAAAGACCGTTGCTGCTGAAT 
CTTCTTCAGCTGAAACTACTGCTCCA 

GCTGTCTCTACCGCTGAAGCCGGTGCTGCTGCTAACGCTGTCCCAGTTGCTGCT 

GGTTTGTrGGCTTTGGCTGCTTTGTT 

TTAAGTrrATTAGAGCITAAATCAAATATTTACAAACAAAATTTTCATTTTCCC 
CCCTTTCCCl'l lCl ICATTCTTCAAA 

AAAGGGTrATTTACTATrAATTGATAAATTTATGGTTrCATGTTAATTTACCCTT 

TrCTTTATAAACATTGGTATTATTA 

TrATCATCATTAGmTTATITATATTTTCGTGAGTTTrrCGG>nTTAATrAATT^ 

nTGGATACATATTAAAAATTTAT 



AC<:ACACACTTTTAAGAGCACAGiAA^AiGAGCACTTATTTCTAiiGACCGCATGTrTGGTAGCACACACTT^ 



TTGGTACTAG 




>8S33 431bp in-house: 1-431




C 




>66g4 579bp in-house: 1-579 

CCCCGTTAACCACTTCTAGGGTATACCATTTCATCTGACTGAATAACTGGTTAG 
TCGATTTGTTGTTGAAGAAAAGTGAC 

CACCTAGTTITTTCTGCCAACATTTTTTGCGATGAGCCGTCGACGCGTTGTCTrT 
TTCTACCCCACGTTTAACAATCTTG 

CCAGTCAATTCCCTAGCCAAATAAACTTTAGACTCACAACTCTAACACTGACTC 
GTGCCCCCCTGTTTAAACTCTAAATT 

ACTTCACAGAGCCTTTACTACCTTAAATTTARGRTIAVTSKAKKGTTTCTGI 1111 
TTGCAAATCACCCTGACTY Glll 11 

TTTTCAGCCAGGl 1 1 11 CGTTAAAATCTGACCAAAAAATTTACRACTCCTATWT 
TTAAAACTCYAAAWWACAATTAAAAC 

TCAA TTCAGACAAGTCCnrCTGCTC ATTCTGAGTCTTCTCTATTGTCl 111 GACT 
mTGTGTGTGACTATnTCATGAT 

CACCCCGTTTCTTGCATTTTTTTCAGTCAACTTTTTCTCAAAATCAAGCCAAAAA 

AACACACCTTTAACTACCTATACAA 

CGCAAACCTATTCAAAACA 




>NDI (17c_cp) 807bp in-house: 1-614 PathoSeq: 615-807 

AACCTATTCCATAATGTTTACTAGATCATTGATTAAAGGTGGTGGCAGACTTGC 
TACTACCAGATCATTGGTCAACAACT 

CTACTAGTTTGGTTTTAAAAAATCAATTTAAGAAATATTCAACATCAACTCCTC 
CTAAGGTTGCCAAATCAAAATCTTCG 

ACAATTGGTAAAATATTCAGATACACTTTTTACACTGCTGTGATATCGGTTATT 
GGTTCTGCCGGTTTGATCGGTTACAA 

AATTTACGAAGAGTCTCAACCTGTTGATCAAGTGAAACAAACACCATTGnTCC 
TAATGGTGAAAAAAAGAAAACTTTAG 

TTATTTTGGGTTCTGGTTGGGGTGCTATTTCATTATTGAAAAACTTGGATACCA 
CCTrGTATAATGTrGNTATTGTCTCC 

CCAAGAAACTATTTCCTTTTCACCCCATTGTTACCATCTGTTCCTACCGGTACTG 
TTGAATTGAGATCTATTATTGAACC 

TGTCAGATCAGTCACCAGAAGATGCCCTGGCAAGTTATTTACCTTGAAGCAGA 
AGCTACAAATATNAACCCCTAAAACTA 

ATGAGTTGACACTTAACAAAGTACTACTGTCCGTTCTGGTCATTCTGGTAAAAA 
TACTTCCTCTTCTAAATCAACTGTTG 

CCGAATACACTGGGGTTGAAGAAATCACTACCACCTTGAATTATGACTATTTA 

GTTGTrGGTGTTGGTGCTCAAACAATN 

CTANTTTTCGGNAATCCTGGGAGNCGCNTGAGGAANTTCAACCCC i 1 1 1 1 i GAA 

AGAANGNCCAGTGGANGCCNTCTGCN 
AATTAGA 



>HOL1 (409c5) part2 762bp PathoSeq: 1-762



GATCAGAATAATGAGGACTTTATACCTGGAACACTCAATATCTATTCCTTGGAA 
GTTGACTCTGAAGATGAAAACGTGAG 

TCATTACGATGCTTCCAGTCGACCAAAAGTGAAAACAAAAGGCAATATAATCC 
TCTTCCCACAACCATCGAATTCATGCA 

ATGATCCATTAAATTGGAGTAAATGGAGAAAGCTAAGTAACl 1 1 1 1 lATTGTCA 
TTTTTATTACTGCrnTACAGCAGCt 

ACTTCAAATGACGCTGGATCAATTCAAGATTCACTTAATGAAAAATATGGAAT 
TAGTTACGACGCAATGAATACAGGGGC 

AGGCGTTTTAi'llil GGGTATTGGATGGGGTACTTTCnTil AACACCTGCTTCG 
TCGTTATATGGTCGAAAAATAACAT 

ACTTTATATGTATCrrTCTTGGTTTATTAGGCGCTGTTTGGTTTGCCTTGGTr/^ 
AAGCACTTCCGACTCAATTTGGTCG 

CAATTGTTTGTrGGTATTAGTGAGAGTTGTGCTGAAGCTCAAGTACAATTAAGT 
TTATCAGAACTTTATnTGCCCATAA 

CCTTGGTTCTGTGCTTACGTCCTATATTGTTGCAACTTCCGTAGGTACTTACTTA 
GGACCTTTAATTGCAGCCnTATTG 

TTCAAAACATrGGTTTTAGATGGGTTGGTTGGATTGCAGCAATTATTAGTGGTG 
CATTATJOTTCGTAATTG illll'l GT 

TrAGATG^feAACCTATTTTGATCGAGCAAAGTrTACCAAGCCA 




>GAL2 (360c6) 1004bp in-house: 625-1004 PathoSeq: 1-624 



TCCATirrCCCTITrCCTCTTTTTCTACATCATCCTCACANCAATTTCAAATATG 
TCTCAAGACAACGTCTCATCAACAT 

CTACAGCTGAGGCTGTAAATAATGAAATCAAAGTCAAAGATGAATTTCCACAA 
GAAGAACAAGCTCATACTAGTTTAGAA 

GATAAACCAGTGAGTGCATACATTGGTATCATCATTATGTGTTTCCTTATTGCC 

TTTGGTGGl'l l IGl'l l ICGGTTTCGA 

TACTGGTACCATrrcrGGTTTTATTAATATGTCTGACTrnTAGAAAGATTCGGT 

GGTACTAAAGCTGACGGTACTCTTT 

ACTTTTCCAATGTCAGAACTGGTTTAATGATTGGTTTGTTCAACGCTGGTTGTG 
CCATTGGTGMWTTATYCTTGTCYAAA 

GTCGGTGATATGTATGGTAGAAGAGTTGGTATCATGACTGCTATGATrGYCTAT 
ATrGTTGGTATTATTGTTCAAATTGC 

TTCTCAACATGCTTGGTATCAAGTCATGATTGGTAGAATTATYACTGGTCTTGC 
CGTYGGTATGTrATCAGTTTTATGTC 




CTITGTrCATTTCCGAGGTTTCTCCAAAACATTTGAGAGGTAC-nTGGTGTGCTG 
TTTCCAATTGATGATTACCTTGGGT 

ATCTTC>nrGGGNTATrGGCTACCTATGGTACTAAGAGTTACTCAGACTCTAGAC 
AATGGAGAATTCCATTAGGTTTATGT 

TrCGCCTGGGCTTTATGTTrGGTrGCTGGTATGGTTAGAATGCCAGAATCTCCA 
CGTTACCTTGTCGGTAAAGACAGAAT 

TGAAGATGCTAAAATGTCACTTGCCAAAACTAACAAGGTTTCTCCAGAGGACC 
CAGCATTATACCGTGAACTTCAATTAA 

TCCAAGCTGGTGTTGAAAGAGAAAGATrGGCCGGTAAAGCATCTTGGGGTACT 
TTATTCAATGGTAAACCAAGAATCTTT 

GAAAGAGTTATTGTrGGTGTCATGTrACAAGCCTTACAACAATT



>KGD2 (98c_cp) 334bp in-house: 139-334 public: 1-138

TTCTAACAACAACATCTrrCTTGGATCnTCAATCAATTCCTrGATGGTTCTrAAG 
AAAATAACAGCTTCACGACCGTCAA 

CTACTCTGTGGTCGTA\GTCAATGCTAAGTACATCATTGGTCTAGAAACGATTT 
GTCCGTrAACAGNAATrGGTCTTTNT 

TTAAAANTGTGTAAACCAAATACGGNAGTTTAANGCAl 111 1 ATAATTGGGGT 
ACAGTATAATGATCCAATAACACNGNC 

ATTANAAATAGTGAAAGAACCNCCGGTCATATCTTACAAAGTCAATTTACNAT 
TTCTGGCnTNTTACNCAAATrANANA 

TTTCCmTNAATA




PrMteiloiilrifi 



>RNR1 (38) 2562bp in-house: 1-2562 



ATGTATGTrTATAAGAGAGATGGCCGTAAAGAGCCAGTACGTTTCGACAAAAT 
CACTGCCAGAGTTCAAAGATTATGTTA 

CGGnTGAATCCAAACCACGTTGAACCAGTTGCTATTACCCAAAAAGTrATATC 
AGGTGnTACCAGGGGGTTACTACTA 

TTGAGTTGGACAACTTGGCTGCAGAAATTGCTGCTACAATGACAACAATTCAC 
CCAGATTACGCTGTCTTAGCCGCTAGA 

ATTGCCGTATCAAATTTACATAAGCAAACCACCAAACAGTATTCCAAAGTGTC 
TAAGGATTTATATGAATACATTAATCC 

TAAGACTGGGTTACACTCTCCTATGATTTCCAAGGAAACCTACGACATCATTAT 
GGAACACGAAGATGA.MTAAACTCAG 



CCATTGTTTACGACAGAGATTTTAACTACAATTATTTTGGGTTCAAGACTTTGG 
AAAGATCATATTTGTTACGTATCAAC 

GGTAAGGTTGCTGAAAGACCACAACATrTGATCATGAGGGTrGCTGTCGGTAT 
TCACGGTAATGATATACCAAGGGTCAT 

TGAAACCTATAACTTGATGTCrCAAAGATTCITCACCCATGGTTCTCCTTGTrrA 
TTTAACGCTGGTACACCA.\GACCAC 

AAATGTCCTCATGTTTCTTGCTTGCTATGAAGGATGATrCTATTGAAGGTATrT 
ACGACACnTGAAATCGTGTGCTTTG 

ATCTCAAAAAGTGCTGGAGGAATCGGTTTACACATCCACAACATTCGTTCTACC 
GGTGCTTACATTGCTGGTACCAATGG 

TACTTCTAATGGTATTATTCCAATGGTAAGAGTATTCAATAACACTGCACGTTA 

TGTCGACCAAGGTGGTAACAAGAGAC 

CTGGTGCCTTTGCCTTGTACTTAGAACCATGGCACAGTGACATTTTTGATTTCA 
TTGATATTAGAAAGAATCACGGTAAA 

GAAGAAATCAGAGCCAGAGATTTGTTCCCAGCTTTGTGGATTCCAGATTTGTTC 
ATGAAAAGAGTTGAACAAAATGGTGA 

CrGGACnTTATTCTCACCAAATGAGGCCCCAGGCITGGCTGATGTTTATGGTGA 
CGAATTGGAAGAATTATACACCAAAT 

ACGAAAAAGAAAACCGTGGTAGACAGACCATCAAAGCTCAAAAATTGTGGTA 
TGCTATnTGGGAGCCCAAACTGAAACA 

GGTACCCCATTTATGTTATATAAAGATTCATGTAACAACAAATCCAACCAAAA 
GAACTrGGGTATTATCA-VATCTrCCAA 

CTTGTGTTGTGAAATTGTTGAATATTCTGCTCCAGATGAAGTTGCTGnTGTAA 
CTTGGCTrCCATTGCCTTGCCATCAT 

TTGTTGAAAATGATGAAAAAAGTACTTGGTACAACTTTGACAAATTACATCAG 
GTCACTAAGGTTGTCACCCGTAACTTG 

AACAGAGTTATTGACCGTAACCATTACCCAGTCCCAGAAGCTGAAAGATCAAA 
CATGAGACACAGACCAATTGCTTTGGG 

TGTTCAAGGTTTGGCTGATGCCTTTATGGAATTGAGATTACCATTTGACTCTCA 
AGAAGCTAGAGAATTGA^CATTCAAA 

TTTTTGAGACTATCTACCATGCTGCTGTTGAAGCTTCAATTGAATTGGCTAAAG 
AAGAAGGTGCCTACGAAACCTATCCA 

GGTTCTCCAGCCTCTCAAGGTTTATTACAATTTGATTTGTGGAACAGAAAACCA 
ACTGAATTATGGGATTGGGATACATT 

AAAACAAGATTrGGCCAAACATGGTATGAGAAACTCCTrGTTGGTTGCACCAA 
TGCCTACTGCTTCCACATCACAAATTT 

TGGGTAACAATGAATGTTTTGAACCATACACTTCTAACATTTACTCTAGAAGAG 
TATTAGCTGGAGAATTCCAAAITGTC 

AATCCATATTTATrGAAOGACTTGGTTGATTTGGGTGTCTGGAACGACGCTATG 
AAAAGTAGTATTATTGCTA.ACAATGG 

TTCTATCCAAGCCTTACC.A.A.ACATCCCTGATGAAATCAAGGCATTGTACAAAA 
CTGTCTGGGAAATCTCACA-AAAACATA 

TTATCGACATGGCTGCTGATAGAGCAGCATTTATTGATCAATCTCAATCATTAA 
ACATTCACATCAAAGATCCAACAATG 

GGTAAATTAACCAGTATGCACTTCTACGGTTGGAAGAAAGGTTTAAAGACTGG 
TATGTACTACTTAAGAACACAAGCTGC 

CAGTGCTGCTATTCAAnTACCATTGATCAAAAGATTGCTGAGACTGCCGGTCA 
TACGGTTGC A A ACTTGGAC A AATTAA 







ACATTAAGAAATATGTTAACAAAGGAAGAGTTGAGAGTGAGAATACCAGTGAT 
GCTCCATACAAGTCACCATCAACCGAA 

CCAACCTCATTAGAAAGTTCAGTTGCTGATTTGAAAATAAAAGATGAAGGTGA 
AAAGCCAGCTGAAGACAAAACCATTGA 

AGAACTCGAAAATGACATTTATAGTGCCAAAGTTATCGCATGTGCTATTGATA 

ATCCAGAATCTTGTACAATGTGTTCTG 

GT 





>SAM2 (36) 1155bp in-house: 1-1155

ATGACTACTTCCAAGGAAACITTCCITrTCACTTCAGAATCCGTTGGTGAAGGT 
CACCCAGATAAGATTTGTGACCAAGT 

CTCCGATGCCATTTTAGATGCTTGTTTAGCTGTrGATCCATTGTCAAAAGTTGCT 
TGTGAAACTGCTGCCAAAACCGGTA 

TGATTATGGTTTTTGGTGAAATTACCACTAAAGCTCAATTGGATTATCAAAAAA 
TCATTAGAGACACCATTAAACACATT 

GGTTACGACGATTCTGAAAAAGGTTTrGATTACAAGACTTGTAACGTCTTGGTr 
GCAATTGAACAACAATCTCCAGATAT 

TGCTCAAGGTTTACATTACGAAAAAGCTTTGGAAGAGTTGGGTGCTGGTGATC 
AAGGTATTATGTTTGGTTATGCCACCG 

ATGAAACCGATGAAAAATTGCCATTGACCATrTTATTGGCCCACAAATTGAAT 
GCTGCCTTGGCTTCTGCCAGAAGATCA 

GGTTCCTTGCCATGGTrGAGACCAGATACCAAAACCCAAGTCACCATCGAGTA 
TGAAAAAGATGGTGGTGCAGTTATCCC 

AAAAAGAGTCGACACAATTGTTATTTCCACTCAACATGCCGAAGAAATCACCA 
CCGAAAATTTGAGAAAAGAAATTATTG 

AACATATCATCAAGCA-AGTCATCCCAGAACATTTATTAGACGACAAAACTATC 
TACCACATTCAGCCATCAGGCAGATTC 

GTCATrGGTGGTCCCCAAGGTGATGCTGGnTGACTGGTAGAAAGATCATTGTT 
GACACCTATGGTGGTTGGGGTGCACA 

TGGTGGTGGTGCCTrCTCAGGCAAGGATTTCTCCAAAGTTGATAGGTCTGCTGC 
TTATGCCGCTCGGTGGGTTGCTAAGT 

CGTTGGTGACCGCCGGATTGGCCAAAAGGGCCTTGGTGCAGTTCTCCTATGCTA 
TTGGGGTTGCTGAACCCACCAGCATT 

TATATAGACACCTATGGGACATCTAAATTGAGCACCGAAGCCCTTGTAGAAAT 

TATCAAGAATAATTTTGACTTACGCCC 

TGGCGTAATrGTAAAAGAATTAGATTTGGCTCGTCCTATTTATTTTAAAACCGC 

TTCTTACGGACATTTTACTAACCAAG 

AAAATTCTTGGGAACAACCAAAAAAATTAAAATTT 






>135g 859bp in-house: 1-859

CCn^ATlATTATCTTAAAACCGTAGATUGCiA^ATTTATCTTiTGAAATGTTClGCGATAAAGAAAGAAA^ 
GTACCACGAGGACT Gl 1 1 1 i GAGAAAAACAACTCGTAiUTTAATGAATCrrAGTTTCTCTATACTTGAATAA i ■ 1 I IGAGT 
TTTCTGGAAAAGACACCTGTTCCAGTTTCAAATTAAACAAGAATGTGAAAAGAATAAAATTTGATTr^^ 
AATAATCCAGGAAAACTCAATTTTCGTAATTGGCAACTTGTCCGAGTGGTTAAGGAGAAAGATTAGJUUTCT^^ 
TTGCCCGCGCAGGTTCGAGTCCTGCAGTTGTCGTTAT IT T l TTrGGTrrACrCTCTATTTTAAAATTTA AAACTAA TCAA 
CTGAAACTGGAGTACCTGCCATGATATGAGTAAATAC i i 1 I I TGATATTAAAAATCTATATAAAACTCCCTATTTATTrr 
TTAATTTAAACCCAGATATTGTCCCAATAATA G TTTTTTGTTTGAACrrATTGCTTTGTATGAACCT^ 
TTTCCAATTTCATACTCTCTTAGTTGGCCACATCAGTGGCTCATTGAATAATTCTGATCTTGAAGTGTACC 
CTGACAAAACTGCACACGGACCCAGTCAATAGCATTATAGATATTTTGAnTAAAGTTCACOT 

TATTGGCCATCTCATCTCATCTTCrrGCAiTAAATTCTTAAACGCTA Li I i I ' lC TCAAACCTTATTATCCCTCTAGATAC 
TCTTCCAAATCTTCAGGTTCAAATATCACTTTAACCATCAATGAACAACTAGGGCAAAC 




n9<fz 

X X rs 

1 MSITVTF^KS ?ST:<XJ^\PAF OrHLEFS^QC £3CX?AI2KAA LAv-pVFS'/DN 
XXX R 

51 q::fvlirdia KvVv'csr^ssYQ livklvkcaw iexsqilxto kdl:ikelfel 

101 IJilSEADCKZ C:,FVISL?Ly YSRIZNKiC/F YVI^RZPEQPK VSKAPTCEKP 
151 ASWAAESDD :::-lDD3E£DE ^/D2DM^S3ND KSGELSKGYK HMHKDKPKVI 
201 NDDRVTIGQV rnwYGwDPST NSMSXLNY^K NFGVSGVRFI* 

25i PKSKXSVAER 21V1.MANNY>: DMHZITEKTES KPKXSFRKPI GKSKXHNI*QI 

is T 

m s 

3C1 DFNSI2:;.SES •/rrOC<^FI?D FSIKHlda"? NYY^/TSMHQS LPLSFNTKNL 

X 

351 NATSNSSYLF »C::Vi<I!<CK5 IQKLVTNSOT ZNYKKTrC^ri TXTYF.GPG3G 
401 N'YXDGAIMNK INKIKLSSKK K??HJCU:VSN NKRYNKSLXG t^^'KSKFDKN? 
451 V2YLLSEQRK YTI::Y£NI:EI tHNSLQFK'/li L:rry^G\J^QB TvmrrcKrKLi 

X fs 

501 IDFEQUCALQ K£A>:ri.EEKK LDAA^HQ^WA ESEKIiP.QERL KLVFBI5BP1J2 

XX I; X * X XX 

S51 FECLQS2FGQ Pv.KXI-EEKLR FJlCLiAS LSD SFEADSINDD ESEt-AQIQQD 

missing sequence 

eOl FESSANALK? KFEArJ'J'j:!! WPA=>P?C?rE rPQlD:*NNK? SLPT^TfPEII 

missing sequence
651 P.MiPLELRGV VPiSXZZL?? YFER?>rXEYI, TR.VR3YPIA2I 

missing
7Ci KlSGWiQ 






i.£P.9831,Q694;Ql 






fs S 

1 QOSYV^QSQP N'VSCQTQD.^G MFSGG^JGGHO KYQQQQG'rNA yG????QOGY 

ambiguities 
X w w wx 

51 YQQgPvG'JGG YyC«?:CQ<2^? M^xVgQQPFSG G>CSCL;«K:L JJxLCVCCnr) 
101 MLF 



22283 

I MRRREIERK:-: iv:-:XP^QKCK SHEAfCROIRI QQLSEQDS?5 w^^tkxhexvf 
31 KKARSTNSGA DETGLMSCK^ FDDSAY3PDY LFSEMI-WNXP NHPDTNHECTK 
101 XYT£j:\r/EKL 0S?5NDTSAY H3SFHDETNI QKEIQIPEND EYVPQMKA^S 

K D VR fs C 

151 SVNNrrrXPAC .'--.keslstss nkkrxfetad vg^/xoldsfx xaqtrniwki 

p 

201 Q^/SDNPW>r/Y ?7>CO:K?.LE:r PEGICXCRCQ 






1 ITDFSOJKtT K1PA:_\EZS: iKKCYICKi- Hrx'CSaClRE 

51 FlLRDMaCPt CXTHVrESGL KRDFLLESrv' ISVASLRPHL I*RLL21BKVS 
ICl SKC2V::51HK3 ANESALWGN^ N'yi^'NDVDBr/ RVXDQLNADK LGEEKG<?AOH 

G fs X 
151 -//iQWSQTTS VILLI £DC£S NGSDSI**/I':C? IC^BRMELDV LCGKKIDDCL 

201 CGKSTKRTPT CZL3?:<A.:<aP XwITSrFKPC ICTKTPSPPT SKASTTPTAC 

SON IK M 

2 51 PCTTLLKAlxV AS?SPVAQST '/HXGXFLPKL D?SSU3?QKX KAKiSDuSCP 



•01 TCG3RN=MEA ?YlKry.'IVK A^TLDS'KHPV' 



£E-SB 














G 






fs 


fs 

X5 A 




m 






» 




1 A*/ AG SA I-AA V SN'STVTDIOr rryriTSCEZ
51 GV^r/T2VXC r/TTYC?LST I'SAPAPSTAT :;v3TT^/^y^iT SCESOKCHET
101 AVT03VTTVT PLFS7BA PGP APSTASESKP AESSrVPTTA
151 AESSPAKTTA AESSPAQETT PKTVAAESSS ASTTAPAVST aeagaaa:lhV
201 PVAAGLIiALA











31 



17c_cp



1 ??>?/AKJacS TrOKIFRYTF YTAVISVIGS AGLIGYKIYE ESQPVDQVKQ 

X 

51 TPirprcGsxK :<Tivizx3SGW GArsiLKM^o tt:.yn''A':vs prsyflftpl 

fs X £8 fd 

101 -rSVFTGCVi L?.SIIEPVRS VTRRCPGQVI YI^EASATKIN PKTHELTLXQ 

151 S'T-TA'SGKc^ jCTCSSSKSrJ' A2YTGVE2IT Tn.NYL:YL\Af GVCAQTILIF 

XXX XX XX X 

201 GNFGR7MRK? KrrrS^.TSG SELQIR 



part2



1 DQNNSDFIPO T-VI'/SLEV:/ SECEra'SKYD ASSKPIT/CTK GtJIILF?CPS 

51 NSCMDPLt^'.v^ K'ATL^LS^IFri VIFITAFTAA TSN^AGSIQD SLNEXYGISY 

lOi DAM>rrCAZ'/L FLGrC-WGTFF LTPASSLYGiR KITYFICZ?!/ GLuGAWFAL 

151 \y:3^SDSZ:fiS iLrJC-lSESC AEAQV^LS-S ELi-FAHNLGS vltsyivats 

201 »«3TVIjC-?LIA Ar:v-NIGr« WV'GWIAAIIS GAuLFVIVFC LDETYFCRAK 

251 FTKP 



1 Crr/SSTSTAE A'.">;;:ZIKV':0 SFPOESQAffr SL5D:<I«/SAV ZOIIIMCPLI 
51 A?GGFVFG?0 rCTISGFIJO: Si:r:.SRFGGT KADGTLYroN VTITCLMIGL? 

10 1 KAGCAIGAL? LSr/COMVGR RVGI>r?AMIV YIVGIIVQIA SvHAOTQVMI 

ambiguities

X 
s 

Z51 GEIIT-GLAVG >:iS-/lCFLFr SEVSFKKLRG TL'.fCCFQLMl TWtrLGYCT 
fs 

201 T-rG TKSYSCS aQW.IPLGLC FA>XALCLVAG jr^Tl>fPSS?RY LVGKDRISDA 

?R 

251 KM£lAKT>n'rv' SF2:r?A:.YK£ LQUQAG'v SF. HRLAOXASV/G TLFIltllcrKI?' 

IV riss irig sequence 

3 CI SRVMLC^I-Q Ali^rNWGKN LFFSYLTSXP K 



missing sequence

1 NAFVSGTITS rLVDVDATVS VGCEZIfCMES GDAPAGGA3A SEAPAXXEEA 

missing sequence

51 PSKAKSESA? AA^.rKr'JEETK KiSPI-GCHSKP APKKEESKXS T^STTSAPT? 
missing sequence
101 TKFSP^EHRV KJC:?,X?.LRZA ERLKSSQNTA ASLTTcNF/D KSNIiMDFRKK 
missing sequence

151 y:<i5Eri2:<TG iKit-rxc-AFS :<ASAiAL:':i;r PAVNAAiiici d?ivfkdyad 

missing st^isr.r^ X XX :CX NX * 

2Gi ISIAVATPKG LVrrr/TK:^^ 5:*SILGI£6CE ZSiriGKKA.^ GKLXliSDKTG 

S X XX X C X X' X F XF X IX 

251 G^FTrSMGG'/ FC-SlY"?ri N^FOTA'v'LGL KG\^ZR?^"r/ CXGQIVSRPH^ 
301 yiALTYOHRV VZ.-?3-.nVZFu RZIKELIEZP RK^LL 



1 M'/^/YKPJX^aK £7v'R?DKITA KVQPLCYGLN P.SKVSPVAIT CKVISG^rrOG
51 VT-TIELDXl?. AIIA.2iTMTTr KPDVAVUtAR IA7£NiHKQT TKQY3Kl/£KO
101 LYEYINPKTG L3iS?visxET YDIIMBHSDE LNSAIVYDRD yrn-NYFQrKT
151 LERSYLLRIN G3r/A£R?<;KL IMP.VAVC-IHG KDl^K'/lET/ NLMSQRFrTH
201 GSPCLFNA'Sr FP.PgMSSCFL LAMKDDSIEG ITOtLKSCAL ISK5AGGIGL
251 HIKN2RSTGA rrA3TKGTS:-J GIIPMVBVFN NTARyVDQGG CJKP.POAFAIiY
301 LE?VfK30IFD ? IDrRKNHGK SEIRAiCDLrP ALWIPDLFMK R*/2QNGDVrrL
351 F5PNEAPGLA tV-fGDEFESL YTXYEKSNRG RQTi:<AQFa.W YAILGACCST
401 GTFFMLYXDS CXnCSI^QKKL GIIXSSNLCC Sr/TYSAPDE VAV^CNLASIA
451 :.PSyV2ND3K STSTKF CKLK QVTKV^/TPJH* ITO'/XDJ^JiY? VPEASRSKMR
501 KRPIALGVQG UOAFMZIRu P7D£QEA?-SI. KIQZFSTIYH AA-ZHASISIA
551 KE2GAYETY? GSPASCGLLQ FDLI^RKFTE LWDWDTLXQD LA:<HGMI0.'£L
601 LVAPKPTAST S^IlCTniZCF EPYTSKIYSF. aVLAGEFQIV NPYLLF.OIVD
651 I.G^y^W2CDA3E-:S SIIAI^'CSIQ ALPiarPDEIK ALYKTVWEIS QKHIICKAAD
701 RAAFIDQSQS L-VrHIKOPTX GXLTSriHr/G WKXGLKTGMY 'lUlTQAASAA
751
801
851




